Overview

Brought to you by YData

Dataset statistics

Number of variables: 138
Number of observations: 3814099
Missing cells: 392091022
Missing cells (%): 74.5%
Total size in memory: 3.9 GiB
Average record size in memory: 1.1 KiB
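
The derived figures in this table can be cross-checked from the raw counts. The following is a minimal sketch that uses only the numbers reported above; the variable names are illustrative, and the record-size result is approximate because it starts from the already rounded 3.9 GiB total.

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 138
n_observations = 3_814_099
missing_cells = 392_091_022
total_size_gib = 3.9  # rounded value as reported

# Every observation has one cell per variable.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Average record size derived from the rounded total, so only approximate.
avg_record_kib = total_size_gib * 1024**3 / n_observations / 1024

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"average record size: {avg_record_kib:.1f} KiB")
```

Both derived values agree with the report after rounding to one decimal place.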

Variable types

Text: 138

Dataset

Description: NMNH Extant Specimen Records (USNM, US) 0049395-241126133413365
URL: https://doi.org/10.15468/dl.42mnjx

Alerts

datasetName has constant value "NMNH Extant Biology" Constant
reproductiveCondition has constant value "Animalia, Chordata, Vertebrata, Amphibia, Anura, Bufonidae" Constant
caste has constant value "Animalia" Constant
behavior has constant value "Chordata" Constant
vitality has constant value "Amphibia" Constant
establishmentMeans has constant value "Anura" Constant
pathway has constant value "Bufonidae" Constant
disposition has constant value "Rhinella" Constant
verbatimLabel has constant value "North America, Canada, Nunavut, Baffin Island" Constant
materialSampleID has constant value "North America" Constant
eventTime has constant value "Nunavut" Constant
sampleSizeValue has constant value "1000.0" Constant
eventRemarks has constant value "GPS" Constant
municipality has constant value "Degrees Minutes Seconds" Constant
locationRemarks has constant value "DeFilipps, R. A." Constant
georeferenceSources has constant value "29 Mar 1889" Constant
latestPeriodOrHighestSystem has constant value "177" Constant
bed has constant value "Riccardia pinguis" Constant
identificationID has constant value "Guadalupe Island, Baja California." Constant
taxonID has constant value "Metzgeriales" Constant
parentNameUsageID has constant value "Plantae, Dicotyledonae (basal), Magnoliales, Annonaceae, Annonoideae" Constant
originalNameUsageID has constant value "Plantae" Constant
taxonConceptID has constant value "Magnoliales" Constant
parentNameUsage has constant value "pinguis" Constant
namePublishedIn has constant value "Guatteria" Constant
namePublishedInYear has constant value "(K. Mert. ex Roth) Derbes & Solier" Constant
subfamily has constant value "(Aubl.) R.A. Howard" Constant
taxonomicStatus has constant value "Chordata" Constant
catalogNumber has 344412 (9.0%) missing values Missing
recordNumber has 1688619 (44.3%) missing values Missing
recordedBy has 806049 (21.1%) missing values Missing
sex has 3118857 (81.8%) missing values Missing
lifeStage has 3359653 (88.1%) missing values Missing
reproductiveCondition has 3814098 (> 99.9%) missing values Missing
caste has 3814098 (> 99.9%) missing values Missing
behavior has 3814098 (> 99.9%) missing values Missing
vitality has 3814098 (> 99.9%) missing values Missing
establishmentMeans has 3814098 (> 99.9%) missing values Missing
pathway has 3814098 (> 99.9%) missing values Missing
preparations has 1975286 (51.8%) missing values Missing
disposition has 3814098 (> 99.9%) missing values Missing
associatedMedia has 1396847 (36.6%) missing values Missing
associatedSequences has 3809026 (99.9%) missing values Missing
occurrenceRemarks has 3306658 (86.7%) missing values Missing
organismName has 3814097 (> 99.9%) missing values Missing
verbatimLabel has 3814098 (> 99.9%) missing values Missing
materialSampleID has 3814098 (> 99.9%) missing values Missing
eventType has 3814096 (> 99.9%) missing values Missing
fieldNumber has 3496495 (91.7%) missing values Missing
eventDate has 653351 (17.1%) missing values Missing
eventTime has 3814098 (> 99.9%) missing values Missing
startDayOfYear has 806907 (21.2%) missing values Missing
endDayOfYear has 805827 (21.1%) missing values Missing
year has 653351 (17.1%) missing values Missing
month has 799915 (21.0%) missing values Missing
day has 1074234 (28.2%) missing values Missing
verbatimEventDate has 2027788 (53.2%) missing values Missing
habitat has 3516278 (92.2%) missing values Missing
sampleSizeValue has 3814098 (> 99.9%) missing values Missing
eventRemarks has 3814098 (> 99.9%) missing values Missing
locationID has 3366761 (88.3%) missing values Missing
higherGeography has 118692 (3.1%) missing values Missing
continent has 534327 (14.0%) missing values Missing
waterBody has 3107446 (81.5%) missing values Missing
islandGroup has 3729526 (97.8%) missing values Missing
island has 3560499 (93.4%) missing values Missing
country has 160727 (4.2%) missing values Missing
stateProvince has 1028496 (27.0%) missing values Missing
county has 2948235 (77.3%) missing values Missing
municipality has 3814098 (> 99.9%) missing values Missing
locality has 544962 (14.3%) missing values Missing
verbatimLocality has 3814096 (> 99.9%) missing values Missing
minimumElevationInMeters has 2930460 (76.8%) missing values Missing
maximumElevationInMeters has 3486461 (91.4%) missing values Missing
verbatimElevation has 3703697 (97.1%) missing values Missing
minimumDepthInMeters has 3390497 (88.9%) missing values Missing
maximumDepthInMeters has 3423246 (89.8%) missing values Missing
verbatimDepth has 3790849 (99.4%) missing values Missing
locationRemarks has 3814098 (> 99.9%) missing values Missing
decimalLatitude has 2665103 (69.9%) missing values Missing
decimalLongitude has 2665103 (69.9%) missing values Missing
geodeticDatum has 3696977 (96.9%) missing values Missing
coordinateUncertaintyInMeters has 3744590 (98.2%) missing values Missing
coordinatePrecision has 3814096 (> 99.9%) missing values Missing
pointRadiusSpatialFit has 3814095 (> 99.9%) missing values Missing
verbatimCoordinates has 3814093 (> 99.9%) missing values Missing
verbatimLatitude has 3492892 (91.6%) missing values Missing
verbatimLongitude has 3493424 (91.6%) missing values Missing
verbatimCoordinateSystem has 3396655 (89.1%) missing values Missing
verbatimSRS has 3814097 (> 99.9%) missing values Missing
footprintSRS has 3814097 (> 99.9%) missing values Missing
footprintSpatialFit has 3814091 (> 99.9%) missing values Missing
georeferencedBy has 3814097 (> 99.9%) missing values Missing
georeferencedDate has 3814097 (> 99.9%) missing values Missing
georeferenceProtocol has 3320409 (87.1%) missing values Missing
georeferenceSources has 3814098 (> 99.9%) missing values Missing
georeferenceRemarks has 3730205 (97.8%) missing values Missing
geologicalContextID has 3814092 (> 99.9%) missing values Missing
earliestEonOrLowestEonothem has 3814086 (> 99.9%) missing values Missing
latestEonOrHighestEonothem has 3814091 (> 99.9%) missing values Missing
earliestEraOrLowestErathem has 3814096 (> 99.9%) missing values Missing
latestEraOrHighestErathem has 3814093 (> 99.9%) missing values Missing
earliestPeriodOrLowestSystem has 3814085 (> 99.9%) missing values Missing
latestPeriodOrHighestSystem has 3814098 (> 99.9%) missing values Missing
earliestEpochOrLowestSeries has 3814085 (> 99.9%) missing values Missing
latestEpochOrHighestSeries has 3814094 (> 99.9%) missing values Missing
earliestAgeOrLowestStage has 3814097 (> 99.9%) missing values Missing
latestAgeOrHighestStage has 3814092 (> 99.9%) missing values Missing
lowestBiostratigraphicZone has 3814093 (> 99.9%) missing values Missing
highestBiostratigraphicZone has 3814097 (> 99.9%) missing values Missing
lithostratigraphicTerms has 3814097 (> 99.9%) missing values Missing
formation has 3814092 (> 99.9%) missing values Missing
member has 3814097 (> 99.9%) missing values Missing
bed has 3814098 (> 99.9%) missing values Missing
identificationID has 3814098 (> 99.9%) missing values Missing
identificationQualifier has 3799723 (99.6%) missing values Missing
typeStatus has 3664511 (96.1%) missing values Missing
identifiedBy has 3157857 (82.8%) missing values Missing
identifiedByID has 3814092 (> 99.9%) missing values Missing
dateIdentified has 3814090 (> 99.9%) missing values Missing
identificationReferences has 3814093 (> 99.9%) missing values Missing
identificationVerificationStatus has 3814095 (> 99.9%) missing values Missing
identificationRemarks has 3814095 (> 99.9%) missing values Missing
taxonID has 3814098 (> 99.9%) missing values Missing
scientificNameID has 3814097 (> 99.9%) missing values Missing
acceptedNameUsageID has 3814096 (> 99.9%) missing values Missing
parentNameUsageID has 3814098 (> 99.9%) missing values Missing
originalNameUsageID has 3814098 (> 99.9%) missing values Missing
nameAccordingToID has 3814097 (> 99.9%) missing values Missing
namePublishedInID has 3814096 (> 99.9%) missing values Missing
taxonConceptID has 3814098 (> 99.9%) missing values Missing
scientificName has 152724 (4.0%) missing values Missing
acceptedNameUsage has 3814096 (> 99.9%) missing values Missing
parentNameUsage has 3814098 (> 99.9%) missing values Missing
originalNameUsage has 3814097 (> 99.9%) missing values Missing
namePublishedIn has 3814098 (> 99.9%) missing values Missing
namePublishedInYear has 3814098 (> 99.9%) missing values Missing
phylum has 1562087 (41.0%) missing values Missing
class has 102065 (2.7%) missing values Missing
order has 410734 (10.8%) missing values Missing
family has 101008 (2.6%) missing values Missing
subfamily has 3814098 (> 99.9%) missing values Missing
genus has 162837 (4.3%) missing values Missing
subgenus has 3729484 (97.8%) missing values Missing
infragenericEpithet has 3814097 (> 99.9%) missing values Missing
specificEpithet has 190700 (5.0%) missing values Missing
infraspecificEpithet has 3381784 (88.7%) missing values Missing
taxonRank has 3381907 (88.7%) missing values Missing
scientificNameAuthorship has 1431500 (37.5%) missing values Missing
vernacularName has 3814096 (> 99.9%) missing values Missing
nomenclaturalCode has 3814094 (> 99.9%) missing values Missing
taxonomicStatus has 3814098 (> 99.9%) missing values Missing
nomenclaturalStatus has 3814097 (> 99.9%) missing values Missing
taxonRemarks has 3814097 (> 99.9%) missing values Missing
gbifID has unique values Unique
occurrenceID has unique values Unique
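
The three alert kinds listed above (Constant, Missing, Unique) can be illustrated with a small rule set. This is a sketch under stated assumptions, not ydata-profiling's implementation: the function name `column_alerts` is hypothetical, `None` stands for a missing cell, and flagging any missing cell at all is an assumed threshold.

```python
def column_alerts(values):
    """Return alert labels for one column; None marks a missing cell."""
    n = len(values)
    present = [v for v in values if v is not None]
    n_missing = n - len(present)
    distinct = set(present)

    alerts = []
    # Constant: exactly one distinct non-missing value.
    if len(distinct) == 1:
        alerts.append("Constant")
    # Missing: assumed here to trigger on any missing cell.
    if n_missing > 0:
        alerts.append("Missing")
    # Unique: no missing cells and every value differs from the others.
    if n > 0 and n_missing == 0 and len(distinct) == n:
        alerts.append("Unique")
    return alerts


# Toy columns modelled on values that appear in this report.
toy = {
    "gbifID": ["1321585620", "2452323322", "1321585780"],
    "datasetName": ["NMNH Extant Biology"] * 3,
    "caste": [None, "Animalia", None],
}
for name, column in toy.items():
    print(name, column_alerts(column))
```

Under these rules a column with a single non-missing value, such as caste, receives both the Constant and the Missing alert, which is the pattern seen for the columns with 3814098 missing values.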

Reproduction

Analysis started: 2025-01-14 16:35:58.342698
Analysis finished: 2025-01-14 16:38:41.454503
Duration: 2 minutes and 43.11 seconds
Software version: ydata-profiling v4.12.1
Download configuration: config.json
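
The reported duration follows directly from the two timestamps. A minimal check using only the standard library (the variable names are illustrative):

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" section.
started = datetime.fromisoformat("2025-01-14 16:35:58.342698")
finished = datetime.fromisoformat("2025-01-14 16:38:41.454503")

# Elapsed time in seconds, then split into whole minutes and remaining seconds.
elapsed = (finished - started).total_seconds()
minutes, seconds = divmod(elapsed, 60)

print(f"{int(minutes)} minutes and {seconds:.2f} seconds")
```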

Variables

gbifID
Text

Unique 

Distinct: 3814099
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 29.1 MiB

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 38140990
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
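
As an illustration of the category counts, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below is not the profiler's own code: `category_counts` and the `CATEGORY_NAMES` mapping are hypothetical helpers, and the mapping covers only the categories that occur in this report. The standard library does not expose scripts or blocks, so those are left out.

```python
import unicodedata
from collections import Counter

# Two-letter general-category codes mapped to the long names used in the report.
CATEGORY_NAMES = {
    "Nd": "Decimal Number",
    "Lu": "Uppercase Letter",
    "Ll": "Lowercase Letter",
    "Pd": "Dash Punctuation",
    "Po": "Other Punctuation",
    "Zs": "Space Separator",
}


def category_counts(values):
    """Count characters per Unicode general category across all values."""
    counts = Counter()
    for value in values:
        for ch in value:
            code = unicodedata.category(ch)
            # Fall back to the raw code for categories not in the mapping.
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


# Three of the gbifID sample values shown in this report.
sample = ["1321585620", "2452323322", "1321585780"]
print(category_counts(sample))
```

For identifiers made up only of digits, every character falls into Decimal Number, which is why gbifID reports a single distinct category.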

Unique

Unique: 3814099
Unique (%): 100.0%

Sample

1st row: 1321585620
2nd row: 2452323322
3rd row: 1321585780
4th row: 1320143695
5th row: 2397792128

Value                   Count    Frequency (%)
1321585620              1        < 0.1%
1321586280              1        < 0.1%
1321587590              1        < 0.1%
1321587488              1        < 0.1%
1320147229              1        < 0.1%
1320145108              1        < 0.1%
1321585780              1        < 0.1%
1320143695              1        < 0.1%
2397792128              1        < 0.1%
1320143630              1        < 0.1%
Other values (3814089)  3814089  > 99.9%

Most occurring characters

ValueCountFrequency (%)
1 6765665
17.7%
3 5422546
14.2%
2 5098023
13.4%
5 3086322
8.1%
8 3049375
8.0%
7 3042848
8.0%
0 2990957
7.8%
4 2929325
7.7%
6 2880749
7.6%
9 2875180
7.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 38140990
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 6765665
17.7%
3 5422546
14.2%
2 5098023
13.4%
5 3086322
8.1%
8 3049375
8.0%
7 3042848
8.0%
0 2990957
7.8%
4 2929325
7.7%
6 2880749
7.6%
9 2875180
7.5%

Most occurring scripts

ValueCountFrequency (%)
Common 38140990
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 6765665
17.7%
3 5422546
14.2%
2 5098023
13.4%
5 3086322
8.1%
8 3049375
8.0%
7 3042848
8.0%
0 2990957
7.8%
4 2929325
7.7%
6 2880749
7.6%
9 2875180
7.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 38140990
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 6765665
17.7%
3 5422546
14.2%
2 5098023
13.4%
5 3086322
8.1%
8 3049375
8.0%
7 3042848
8.0%
0 2990957
7.8%
4 2929325
7.7%
6 2880749
7.6%
9 2875180
7.5%

Distinct: 286119
Distinct (%): 7.5%
Missing: 0
Missing (%): 0.0%
Memory size: 29.1 MiB

Length

Max length: 19
Median length: 19
Mean length: 19
Min length: 19

Characters and Unicode

Total characters: 72467881
Distinct characters: 13
Distinct categories: 4
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 125475
Unique (%): 3.3%

Sample

1st row: 2023-05-10 09:22:00
2nd row: 2022-01-03 14:31:00
3rd row: 2022-08-17 11:23:00
4th row: 2022-12-30 12:34:00
5th row: 2019-07-10 10:37:00

Value                Count    Frequency (%)
2024-09-25           284800   3.7%
2022-09-22           111266   1.5%
2018-09-17           106629   1.4%
2017-08-04           96033    1.3%
2022-10-26           86404    1.1%
2022-08-17           68138    0.9%
2022-03-25           66124    0.9%
2022-06-03           50303    0.7%
2018-10-02           48765    0.6%
2022-09-08           40115    0.5%
Other values (4857)  6669621  87.4%

Most occurring characters

ValueCountFrequency (%)
0 18925013
26.1%
2 10663155
14.7%
1 9193210
12.7%
- 7628198
10.5%
: 7628198
10.5%
3814099
 
5.3%
4 2499475
 
3.4%
3 2486901
 
3.4%
5 2329941
 
3.2%
9 2243456
 
3.1%
Other values (3) 5056235
 
7.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 53397386
73.7%
Dash Punctuation 7628198
 
10.5%
Other Punctuation 7628198
 
10.5%
Space Separator 3814099
 
5.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 18925013
35.4%
2 10663155
20.0%
1 9193210
17.2%
4 2499475
 
4.7%
3 2486901
 
4.7%
5 2329941
 
4.4%
9 2243456
 
4.2%
8 1862086
 
3.5%
7 1735441
 
3.3%
6 1458708
 
2.7%
Dash Punctuation
ValueCountFrequency (%)
- 7628198
100.0%
Other Punctuation
ValueCountFrequency (%)
: 7628198
100.0%
Space Separator
ValueCountFrequency (%)
3814099
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 72467881
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 18925013
26.1%
2 10663155
14.7%
1 9193210
12.7%
- 7628198
10.5%
: 7628198
10.5%
3814099
 
5.3%
4 2499475
 
3.4%
3 2486901
 
3.4%
5 2329941
 
3.2%
9 2243456
 
3.1%
Other values (3) 5056235
 
7.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 72467881
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 18925013
26.1%
2 10663155
14.7%
1 9193210
12.7%
- 7628198
10.5%
: 7628198
10.5%
3814099
 
5.3%
4 2499475
 
3.4%
3 2486901
 
3.4%
5 2329941
 
3.2%
9 2243456
 
3.1%
Other values (3) 5056235
 
7.0%

Distinct: 43
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 29.1 MiB

Length

Max length: 29
Median length: 29
Mean length: 28.98748223
Min length: 2

Characters and Unicode

Total characters: 110561127
Distinct characters: 41
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 10
Unique (%): < 0.1%

Sample

1st row: urn:lsid:biocol.org:col:34871
2nd row: urn:lsid:biocol.org:col:15463
3rd row: urn:lsid:biocol.org:col:34871
4th row: urn:lsid:biocol.org:col:34871
5th row: urn:lsid:biocol.org:col:34871

Value                          Count    Frequency (%)
urn:lsid:biocol.org:col:34871  1956033  51.3%
urn:lsid:biocol.org:col:15463  1856185  48.7%
nsmt                           425      < 0.1%
uam                            339      < 0.1%
rmnh                           146      < 0.1%
nrm                            137      < 0.1%
nmv                            112      < 0.1%
rcs                            95       < 0.1%
nmsz                           77       < 0.1%
zmmu                           70       < 0.1%
Other values (33)              480      < 0.1%

Most occurring characters

ValueCountFrequency (%)
o 15248872
13.8%
: 15248872
13.8%
l 11436654
 
10.3%
c 7624436
 
6.9%
i 7624436
 
6.9%
r 7624436
 
6.9%
s 3812218
 
3.4%
d 3812218
 
3.4%
b 3812218
 
3.4%
n 3812218
 
3.4%
Other values (31) 30504549
27.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 72432142
65.5%
Other Punctuation 19061090
 
17.2%
Decimal Number 19061090
 
17.2%
Uppercase Letter 6805
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
M 1841
27.1%
N 1093
16.1%
S 749
11.0%
A 564
 
8.3%
U 496
 
7.3%
T 425
 
6.2%
R 392
 
5.8%
H 233
 
3.4%
C 212
 
3.1%
Z 195
 
2.9%
Other values (11) 605
 
8.9%
Lowercase Letter
ValueCountFrequency (%)
o 15248872
21.1%
l 11436654
15.8%
c 7624436
10.5%
i 7624436
10.5%
r 7624436
10.5%
s 3812218
 
5.3%
d 3812218
 
5.3%
b 3812218
 
5.3%
n 3812218
 
5.3%
g 3812218
 
5.3%
Decimal Number
ValueCountFrequency (%)
3 3812218
20.0%
4 3812218
20.0%
1 3812218
20.0%
8 1956033
10.3%
7 1956033
10.3%
5 1856185
9.7%
6 1856185
9.7%
Other Punctuation
ValueCountFrequency (%)
: 15248872
80.0%
. 3812218
 
20.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 72438947
65.5%
Common 38122180
34.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
o 15248872
21.1%
l 11436654
15.8%
c 7624436
10.5%
i 7624436
10.5%
r 7624436
10.5%
s 3812218
 
5.3%
d 3812218
 
5.3%
b 3812218
 
5.3%
n 3812218
 
5.3%
g 3812218
 
5.3%
Other values (22) 3819023
 
5.3%
Common
ValueCountFrequency (%)
: 15248872
40.0%
. 3812218
 
10.0%
3 3812218
 
10.0%
4 3812218
 
10.0%
1 3812218
 
10.0%
8 1956033
 
5.1%
7 1956033
 
5.1%
5 1856185
 
4.9%
6 1856185
 
4.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 110561127
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
o 15248872
13.8%
: 15248872
13.8%
l 11436654
 
10.3%
c 7624436
 
6.9%
i 7624436
 
6.9%
r 7624436
 
6.9%
s 3812218
 
3.4%
d 3812218
 
3.4%
b 3812218
 
3.4%
n 3812218
 
3.4%
Other values (31) 30504549
27.6%

Distinct: 7
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 29.1 MiB

Length

Max length: 45
Median length: 45
Mean length: 45
Min length: 45

Characters and Unicode

Total characters: 171634455
Distinct characters: 22
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: urn:uuid:f14c21a9-8cbf-4c8b-817f-d19d427e2dd6
2nd row: urn:uuid:60e28f81-e634-4869-aa3e-732caed713c8
3rd row: urn:uuid:cc104cbf-fd8e-4801-9b71-36731a7db1a0
4th row: urn:uuid:f14c21a9-8cbf-4c8b-817f-d19d427e2dd6
5th row: urn:uuid:f14c21a9-8cbf-4c8b-817f-d19d427e2dd6

Value                                          Count    Frequency (%)
urn:uuid:60e28f81-e634-4869-aa3e-732caed713c8  1856185  48.7%
urn:uuid:f14c21a9-8cbf-4c8b-817f-d19d427e2dd6  792284   20.8%
urn:uuid:18e3cd08-a962-4f0a-b72c-9a0b3600c5ad  249390   6.5%
urn:uuid:59e56a59-8615-4e0c-841d-eb88f3876b22  247291   6.5%
urn:uuid:73d83e23-1999-42cd-b38a-c06a7d32d893  240577   6.3%
urn:uuid:cc104cbf-fd8e-4801-9b71-36731a7db1a0  240491   6.3%
urn:uuid:09c9cf5f-f5d3-48cc-b5c8-cd9b9fbd631f  187881   4.9%

Most occurring characters

ValueCountFrequency (%)
- 15256396
 
8.9%
8 13115302
 
7.6%
d 11592424
 
6.8%
u 11442297
 
6.7%
3 10471017
 
6.1%
e 9689355
 
5.6%
c 9414596
 
5.5%
1 9256391
 
5.4%
a 8567826
 
5.0%
6 8270441
 
4.8%
Other values (12) 64558410
37.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 76347338
44.5%
Lowercase Letter 72402523
42.2%
Dash Punctuation 15256396
 
8.9%
Other Punctuation 7628198
 
4.4%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
8 13115302
17.2%
3 10471017
13.7%
1 9256391
12.1%
6 8270441
10.8%
2 7804315
10.2%
4 7742634
10.1%
7 6996246
9.2%
9 6388438
8.4%
0 4500357
 
5.9%
5 1802197
 
2.4%
Lowercase Letter
ValueCountFrequency (%)
d 11592424
16.0%
u 11442297
15.8%
e 9689355
13.4%
c 9414596
13.0%
a 8567826
11.8%
f 6150105
8.5%
b 4103623
 
5.7%
r 3814099
 
5.3%
i 3814099
 
5.3%
n 3814099
 
5.3%
Dash Punctuation
ValueCountFrequency (%)
- 15256396
100.0%
Other Punctuation
ValueCountFrequency (%)
: 7628198
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 99231932
57.8%
Latin 72402523
42.2%

Most frequent character per script

Common
ValueCountFrequency (%)
- 15256396
15.4%
8 13115302
13.2%
3 10471017
10.6%
1 9256391
9.3%
6 8270441
8.3%
2 7804315
7.9%
4 7742634
7.8%
: 7628198
7.7%
7 6996246
7.1%
9 6388438
6.4%
Other values (2) 6302554
6.4%
Latin
ValueCountFrequency (%)
d 11592424
16.0%
u 11442297
15.8%
e 9689355
13.4%
c 9414596
13.0%
a 8567826
11.8%
f 6150105
8.5%
b 4103623
 
5.7%
r 3814099
 
5.3%
i 3814099
 
5.3%
n 3814099
 
5.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 171634455
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
- 15256396
 
8.9%
8 13115302
 
7.6%
d 11592424
 
6.8%
u 11442297
 
6.7%
3 10471017
 
6.1%
e 9689355
 
5.6%
c 9414596
 
5.5%
1 9256391
 
5.4%
a 8567826
 
5.0%
6 8270441
 
4.8%
Other values (12) 64558410
37.6%

Distinct: 43
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 29.1 MiB

Length

Max length: 6
Median length: 4
Mean length: 3.026483319
Min length: 2

Characters and Unicode

Total characters: 11543307
Distinct characters: 21
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 10
Unique (%): < 0.1%

Sample

1st row: USNM
2nd row: US
3rd row: USNM
4th row: USNM
5th row: USNM

Value              Count    Frequency (%)
usnm               1956033  51.3%
us                 1856185  48.7%
nsmt               425      < 0.1%
uam                339      < 0.1%
rmnh               146      < 0.1%
nrm                137      < 0.1%
nmv                112      < 0.1%
rcs                95       < 0.1%
nmsz               77       < 0.1%
zmmu               70       < 0.1%
Other values (33)  480      < 0.1%

Most occurring characters

ValueCountFrequency (%)
S 3812967
33.0%
U 3812714
33.0%
M 1957874
17.0%
N 1957126
17.0%
A 564
 
< 0.1%
T 425
 
< 0.1%
R 392
 
< 0.1%
H 233
 
< 0.1%
C 212
 
< 0.1%
Z 195
 
< 0.1%
Other values (11) 605
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 11543307
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
S 3812967
33.0%
U 3812714
33.0%
M 1957874
17.0%
N 1957126
17.0%
A 564
 
< 0.1%
T 425
 
< 0.1%
R 392
 
< 0.1%
H 233
 
< 0.1%
C 212
 
< 0.1%
Z 195
 
< 0.1%
Other values (11) 605
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 11543307
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
S 3812967
33.0%
U 3812714
33.0%
M 1957874
17.0%
N 1957126
17.0%
A 564
 
< 0.1%
T 425
 
< 0.1%
R 392
 
< 0.1%
H 233
 
< 0.1%
C 212
 
< 0.1%
Z 195
 
< 0.1%
Other values (11) 605
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 11543307
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
S 3812967
33.0%
U 3812714
33.0%
M 1957874
17.0%
N 1957126
17.0%
A 564
 
< 0.1%
T 425
 
< 0.1%
R 392
 
< 0.1%
H 233
 
< 0.1%
C 212
 
< 0.1%
Z 195
 
< 0.1%
Other values (11) 605
 
< 0.1%

Distinct: 7
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 29.1 MiB

Length

Max length: 5
Median length: 2
Mean length: 2.608911043
Min length: 2

Characters and Unicode

Total characters: 9950645
Distinct characters: 15
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: IZ
2nd row: US
3rd row: HERP
4th row: IZ
5th row: IZ

Value  Count    Frequency (%)
us     1856185  48.7%
iz     792284   20.8%
ent    249390   6.5%
mamm   247291   6.5%
birds  240577   6.3%
herp   240491   6.3%
fish   187881   4.9%

Most occurring characters

ValueCountFrequency (%)
S 2284643
23.0%
U 1856185
18.7%
I 1220742
12.3%
Z 792284
 
8.0%
M 741873
 
7.5%
E 489881
 
4.9%
R 481068
 
4.8%
H 428372
 
4.3%
N 249390
 
2.5%
T 249390
 
2.5%
Other values (5) 1156817
11.6%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 9950645
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
S 2284643
23.0%
U 1856185
18.7%
I 1220742
12.3%
Z 792284
 
8.0%
M 741873
 
7.5%
E 489881
 
4.9%
R 481068
 
4.8%
H 428372
 
4.3%
N 249390
 
2.5%
T 249390
 
2.5%
Other values (5) 1156817
11.6%

Most occurring scripts

ValueCountFrequency (%)
Latin 9950645
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
S 2284643
23.0%
U 1856185
18.7%
I 1220742
12.3%
Z 792284
 
8.0%
M 741873
 
7.5%
E 489881
 
4.9%
R 481068
 
4.8%
H 428372
 
4.3%
N 249390
 
2.5%
T 249390
 
2.5%
Other values (5) 1156817
11.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 9950645
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
S 2284643
23.0%
U 1856185
18.7%
I 1220742
12.3%
Z 792284
 
8.0%
M 741873
 
7.5%
E 489881
 
4.9%
R 481068
 
4.8%
H 428372
 
4.3%
N 249390
 
2.5%
T 249390
 
2.5%
Other values (5) 1156817
11.6%

datasetName
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 29.1 MiB

Length

Max length: 19
Median length: 19
Mean length: 19
Min length: 19

Characters and Unicode

Total characters: 72467881
Distinct characters: 15
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: NMNH Extant Biology
2nd row: NMNH Extant Biology
3rd row: NMNH Extant Biology
4th row: NMNH Extant Biology
5th row: NMNH Extant Biology

Value    Count    Frequency (%)
nmnh     3814099  33.3%
extant   3814099  33.3%
biology  3814099  33.3%
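
The value table for datasetName lists nmnh, extant and biology separately at 33.3% each, although every row holds the single string "NMNH Extant Biology". This suggests the table counts lowercased, whitespace-separated tokens rather than whole values. The sketch below reproduces that behaviour; it is inferred from the table alone, not taken from the profiler's source, and `word_frequencies` is a hypothetical helper name.

```python
from collections import Counter


def word_frequencies(values):
    """Count lowercased, whitespace-separated tokens across all values.

    Returns (token, count, percentage of all tokens) tuples, most common first.
    """
    counts = Counter()
    for value in values:
        counts.update(value.lower().split())
    total = sum(counts.values())
    return [(word, n, 100 * n / total) for word, n in counts.most_common()]


# A handful of rows standing in for the constant datasetName column.
rows = ["NMNH Extant Biology"] * 5
for word, n, pct in word_frequencies(rows):
    print(f"{word} {n} {pct:.1f}%")
```

Each of the three tokens accounts for one third of all tokens, matching the 33.3% shares in the table.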

Most occurring characters

ValueCountFrequency (%)
N 7628198
 
10.5%
7628198
 
10.5%
t 7628198
 
10.5%
o 7628198
 
10.5%
M 3814099
 
5.3%
H 3814099
 
5.3%
E 3814099
 
5.3%
x 3814099
 
5.3%
a 3814099
 
5.3%
n 3814099
 
5.3%
Other values (5) 19070495
26.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 41955089
57.9%
Uppercase Letter 22884594
31.6%
Space Separator 7628198
 
10.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t 7628198
18.2%
o 7628198
18.2%
x 3814099
9.1%
a 3814099
9.1%
n 3814099
9.1%
i 3814099
9.1%
l 3814099
9.1%
g 3814099
9.1%
y 3814099
9.1%
Uppercase Letter
ValueCountFrequency (%)
N 7628198
33.3%
M 3814099
16.7%
H 3814099
16.7%
E 3814099
16.7%
B 3814099
16.7%
Space Separator
ValueCountFrequency (%)
7628198
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 64839683
89.5%
Common 7628198
 
10.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
N 7628198
11.8%
t 7628198
11.8%
o 7628198
11.8%
M 3814099
 
5.9%
H 3814099
 
5.9%
E 3814099
 
5.9%
x 3814099
 
5.9%
a 3814099
 
5.9%
n 3814099
 
5.9%
B 3814099
 
5.9%
Other values (4) 15256396
23.5%
Common
ValueCountFrequency (%)
7628198
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 72467881
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
N 7628198
 
10.5%
7628198
 
10.5%
t 7628198
 
10.5%
o 7628198
 
10.5%
M 3814099
 
5.3%
H 3814099
 
5.3%
E 3814099
 
5.3%
x 3814099
 
5.3%
a 3814099
 
5.3%
n 3814099
 
5.3%
Other values (5) 19070495
26.3%

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 29.1 MiB

Length

Max length: 18
Median length: 17
Mean length: 17.0061076
Min length: 16

Characters and Unicode

Total characters: 64862978
Distinct characters: 21
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: PreservedSpecimen
2nd row: PreservedSpecimen
3rd row: PreservedSpecimen
4th row: PreservedSpecimen
5th row: PreservedSpecimen

Value               Count    Frequency (%)
preservedspecimen   3763178  98.7%
machineobservation  37108    1.0%
humanobservation    13813    0.4%

Most occurring characters

ValueCountFrequency (%)
e 18903919
29.1%
r 7577277
11.7%
n 3865020
 
6.0%
i 3851207
 
5.9%
s 3814099
 
5.9%
v 3814099
 
5.9%
c 3800286
 
5.9%
m 3776991
 
5.8%
P 3763178
 
5.8%
p 3763178
 
5.8%
Other values (11) 7933724
12.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 57234780
88.2%
Uppercase Letter 7628198
 
11.8%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 18903919
33.0%
r 7577277
13.2%
n 3865020
 
6.8%
i 3851207
 
6.7%
s 3814099
 
6.7%
v 3814099
 
6.7%
c 3800286
 
6.6%
m 3776991
 
6.6%
p 3763178
 
6.6%
d 3763178
 
6.6%
Other values (6) 305526
 
0.5%
Uppercase Letter
ValueCountFrequency (%)
P 3763178
49.3%
S 3763178
49.3%
O 50921
 
0.7%
M 37108
 
0.5%
H 13813
 
0.2%

Most occurring scripts

ValueCountFrequency (%)
Latin 64862978
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 18903919
29.1%
r 7577277
11.7%
n 3865020
 
6.0%
i 3851207
 
5.9%
s 3814099
 
5.9%
v 3814099
 
5.9%
c 3800286
 
5.9%
m 3776991
 
5.8%
P 3763178
 
5.8%
p 3763178
 
5.8%
Other values (11) 7933724
12.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 64862978
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 18903919
29.1%
r 7577277
11.7%
n 3865020
 
6.0%
i 3851207
 
5.9%
s 3814099
 
5.9%
v 3814099
 
5.9%
c 3800286
 
5.9%
m 3776991
 
5.8%
P 3763178
 
5.8%
p 3763178
 
5.8%
Other values (11) 7933724
12.2%

occurrenceID
Text

Unique 

Distinct: 3814099
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 29.1 MiB

Length

Max length: 63
Median length: 63
Mean length: 63
Min length: 63

Characters and Unicode

Total characters: 240288237
Distinct characters: 26
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3814099
Unique (%): 100.0%

Sample

1st row: http://n2t.net/ark:/65665/3c1d5cd1b-23f9-4aab-8cd8-011e6535be18
2nd row: http://n2t.net/ark:/65665/38212d138-cfcd-4363-8d3b-93b82afc1d4b
3rd row: http://n2t.net/ark:/65665/3c1d69371-acc7-4c47-bc57-9d5ba7994267
4th row: http://n2t.net/ark:/65665/382140f93-30c1-4f26-bd0c-77d197d5ebc0
5th row: http://n2t.net/ark:/65665/3c1d814f8-bb57-4c37-a953-dd84b1c6415d

Value                                                            Count    Frequency (%)
http://n2t.net/ark:/65665/3c1d5cd1b-23f9-4aab-8cd8-011e6535be18  1        < 0.1%
http://n2t.net/ark:/65665/3c1dca2ff-5a4b-407d-be1e-8c2465e2dbc4  1        < 0.1%
http://n2t.net/ark:/65665/3c1eb4e39-5ffd-4448-b2ce-395313b0c10e  1        < 0.1%
http://n2t.net/ark:/65665/3c1ea3d60-dba8-415e-80b6-a9bc8c946ff8  1        < 0.1%
http://n2t.net/ark:/65665/3823c9e76-01df-419c-aa05-a6aec0f69473  1        < 0.1%
http://n2t.net/ark:/65665/382257af5-f81d-4f8a-aff0-9f0f328b0fdb  1        < 0.1%
http://n2t.net/ark:/65665/3c1d69371-acc7-4c47-bc57-9d5ba7994267  1        < 0.1%
http://n2t.net/ark:/65665/382140f93-30c1-4f26-bd0c-77d197d5ebc0  1        < 0.1%
http://n2t.net/ark:/65665/3c1d814f8-bb57-4c37-a953-dd84b1c6415d  1        < 0.1%
http://n2t.net/ark:/65665/38215186e-af4f-46dc-8b81-ec58617bdfd7  1        < 0.1%
Other values (3814089)                                           3814089  > 99.9%

Most occurring characters

ValueCountFrequency (%)
/ 19070495
 
7.9%
6 18595063
 
7.7%
- 15256396
 
6.3%
t 15256396
 
6.3%
5 14773602
 
6.1%
a 11917774
 
5.0%
2 10968294
 
4.6%
e 10967918
 
4.6%
3 10962313
 
4.6%
4 10959604
 
4.6%
Other values (16) 101560382
42.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 103921686
43.2%
Lowercase Letter 90597363
37.7%
Other Punctuation 30512792
 
12.7%
Dash Punctuation 15256396
 
6.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t 15256396
16.8%
a 11917774
13.2%
e 10967918
12.1%
b 8105562
8.9%
n 7628198
8.4%
c 7157232
7.9%
d 7155364
7.9%
f 7152523
7.9%
k 3814099
 
4.2%
r 3814099
 
4.2%
Other values (2) 7628198
8.4%
Decimal Number
ValueCountFrequency (%)
6 18595063
17.9%
5 14773602
14.2%
2 10968294
10.6%
3 10962313
10.5%
4 10959604
10.5%
9 8108056
7.8%
8 8105434
7.8%
1 7152772
 
6.9%
0 7148472
 
6.9%
7 7148076
 
6.9%
Other Punctuation
ValueCountFrequency (%)
/ 19070495
62.5%
: 7628198
 
25.0%
. 3814099
 
12.5%
Dash Punctuation
ValueCountFrequency (%)
- 15256396
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 149690874
62.3%
Latin 90597363
37.7%

Most frequent character per script

Common
ValueCountFrequency (%)
/ 19070495
12.7%
6 18595063
12.4%
- 15256396
10.2%
5 14773602
9.9%
2 10968294
7.3%
3 10962313
7.3%
4 10959604
7.3%
9 8108056
 
5.4%
8 8105434
 
5.4%
: 7628198
 
5.1%
Other values (4) 25263419
16.9%
Latin
ValueCountFrequency (%)
t 15256396
16.8%
a 11917774
13.2%
e 10967918
12.1%
b 8105562
8.9%
n 7628198
8.4%
c 7157232
7.9%
d 7155364
7.9%
f 7152523
7.9%
k 3814099
 
4.2%
r 3814099
 
4.2%
Other values (2) 7628198
8.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 240288237
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
/ 19070495
 
7.9%
6 18595063
 
7.7%
- 15256396
 
6.3%
t 15256396
 
6.3%
5 14773602
 
6.1%
a 11917774
 
5.0%
2 10968294
 
4.6%
e 10967918
 
4.6%
3 10962313
 
4.6%
4 10959604
 
4.6%
Other values (16) 101560382
42.3%

catalogNumber
Text

Missing 

Distinct 2680425
Distinct (%) 77.3%
Missing 344412
Missing (%) 9.0%
Memory size 29.1 MiB

Length

Max length 22
Median length 21
Mean length 10.5415526
Min length 4

Characters and Unicode

Total characters 36575888
Distinct characters 69
Distinct categories 8
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 2220561
Unique (%) 64.0%
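
The percentages reported for catalogNumber can be reproduced from the raw counts with a little arithmetic. The sketch below assumes, as an inference from the figures rather than something the report states, that Missing (%) is taken over all observations while Distinct (%) and Unique (%) are taken over non-missing values only. The helper `pct` and the constant names are illustrative.

```python
# Counts copied from the report: total observations and the catalogNumber statistics.
N_ROWS = 3814099
MISSING = 344412
DISTINCT = 2680425
UNIQUE = 2220561


def pct(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Assumption: distinct and unique shares are relative to non-missing values.
non_missing = N_ROWS - MISSING

missing_pct = pct(MISSING, N_ROWS)
distinct_pct = pct(DISTINCT, non_missing)
unique_pct = pct(UNIQUE, non_missing)

print(missing_pct, distinct_pct, unique_pct)
```

Under that assumption the three results agree with the 9.0%, 77.3% and 64.0% shown for this variable.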

Sample

1st row USNM 1220020
2nd row US 2327562
3rd row USNM 359728
4th row USNM 65866
5th row USNM 1569732
ValueCountFrequency (%)
usnm 1706642
25.2%
us 1589980
23.5%
herp 2389
 
< 0.1%
tissue 2336
 
< 0.1%
sem 97
 
< 0.1%
69
 
< 0.1%
1 61
 
< 0.1%
stub 57
 
< 0.1%
image 53
 
< 0.1%
micrograph 40
 
< 0.1%
Other values (2298683) 3469782
51.2%

Most occurring characters

ValueCountFrequency (%)
S 3476269
 
9.5%
U 3468327
 
9.5%
(space) 3301819
 
9.0%
1 2915757
 
8.0%
2 2640645
 
7.2%
3 2480659
 
6.8%
0 2116762
 
5.8%
4 2111189
 
5.8%
5 2073315
 
5.7%
N 2019199
 
5.5%
Other values (59) 9971947
27.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 21938221
60.0%
Uppercase Letter 11281252
30.8%
Space Separator 3301819
 
9.0%
Lowercase Letter 42269
 
0.1%
Dash Punctuation 9275
 
< 0.1%
Other Punctuation 3024
 
< 0.1%
Close Punctuation 14
 
< 0.1%
Open Punctuation 14
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
w 17734
42.0%
e 4857
 
11.5%
s 4672
 
11.1%
a 3519
 
8.3%
r 2470
 
5.8%
p 2432
 
5.8%
u 2407
 
5.7%
i 2385
 
5.6%
b 800
 
1.9%
c 291
 
0.7%
Other values (16) 702
 
1.7%
Uppercase Letter
ValueCountFrequency (%)
S 3476269
30.8%
U 3468327
30.7%
N 2019199
17.9%
M 1873513
16.6%
E 181447
 
1.6%
T 162747
 
1.4%
A 27947
 
0.2%
D 27138
 
0.2%
R 18007
 
0.2%
B 14421
 
0.1%
Other values (15) 12237
 
0.1%
Decimal Number
ValueCountFrequency (%)
1 2915757
13.3%
2 2640645
12.0%
3 2480659
11.3%
0 2116762
9.6%
4 2111189
9.6%
5 2073315
9.5%
6 1957916
8.9%
7 1917347
8.7%
8 1892550
8.6%
9 1832081
8.4%
Other Punctuation
ValueCountFrequency (%)
. 1723
57.0%
* 1295
42.8%
? 5
 
0.2%
' 1
 
< 0.1%
Space Separator
ValueCountFrequency (%)
(space) 3301819
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 9275
100.0%
Close Punctuation
ValueCountFrequency (%)
) 14
100.0%
Open Punctuation
ValueCountFrequency (%)
( 14
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 25252367
69.0%
Latin 11323521
31.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
S 3476269
30.7%
U 3468327
30.6%
N 2019199
17.8%
M 1873513
16.5%
E 181447
 
1.6%
T 162747
 
1.4%
A 27947
 
0.2%
D 27138
 
0.2%
R 18007
 
0.2%
w 17734
 
0.2%
Other values (41) 51193
 
0.5%
Common
ValueCountFrequency (%)
(space) 3301819
13.1%
1 2915757
11.5%
2 2640645
10.5%
3 2480659
9.8%
0 2116762
8.4%
4 2111189
8.4%
5 2073315
8.2%
6 1957916
7.8%
7 1917347
7.6%
8 1892550
7.5%
Other values (8) 1844408
7.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 36575888
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
S 3476269
 
9.5%
U 3468327
 
9.5%
(space) 3301819
 
9.0%
1 2915757
 
8.0%
2 2640645
 
7.2%
3 2480659
 
6.8%
0 2116762
 
5.8%
4 2111189
 
5.8%
5 2073315
 
5.7%
N 2019199
 
5.5%
Other values (59) 9971947
27.3%

recordNumber
Text

Missing 

Distinct 368960
Distinct (%) 17.4%
Missing 1688619
Missing (%) 44.3%
Memory size 29.1 MiB

Length

Max length 93
Median length 90
Mean length 4.785350133
Min length 1

Characters and Unicode

Total characters 10171166
Distinct characters 112
Distinct categories 14
Distinct scripts 3
Distinct blocks 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 292732
Unique (%) 13.8%

Sample

1st row 5209
2nd row USNPC # 008843
3rd row USNPC # 074963
4th row 478
5th row s.n.
ValueCountFrequency (%)
s.n 264664
 
11.1%
41913
 
1.8%
usnpc 36535
 
1.5%
no 19873
 
0.8%
number 19484
 
0.8%
bureau 8434
 
0.4%
eyd 6470
 
0.3%
s 5865
 
0.2%
n 5665
 
0.2%
of 5647
 
0.2%
Other values (270340) 1961294
82.6%

Most occurring characters

ValueCountFrequency (%)
1 1157151
11.4%
2 898438
 
8.8%
3 773609
 
7.6%
0 742907
 
7.3%
4 724395
 
7.1%
5 695725
 
6.8%
6 673176
 
6.6%
7 636233
 
6.3%
8 610720
 
6.0%
9 594451
 
5.8%
Other values (102) 2664361
26.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 7506805
73.8%
Lowercase Letter 877335
 
8.6%
Uppercase Letter 735091
 
7.2%
Other Punctuation 641780
 
6.3%
Space Separator 250364
 
2.5%
Dash Punctuation 145228
 
1.4%
Connector Punctuation 6142
 
0.1%
Close Punctuation 3742
 
< 0.1%
Open Punctuation 3741
 
< 0.1%
Other Number 651
 
< 0.1%
Other values (4) 287
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n 283424
32.3%
s 277432
31.6%
e 48508
 
5.5%
u 39655
 
4.5%
r 39309
 
4.5%
o 35505
 
4.0%
a 30735
 
3.5%
b 29074
 
3.3%
m 21508
 
2.5%
c 16556
 
1.9%
Other values (26) 55629
 
6.3%
Uppercase Letter
ValueCountFrequency (%)
N 95367
13.0%
S 74319
 
10.1%
C 61194
 
8.3%
P 58289
 
7.9%
U 43654
 
5.9%
B 39926
 
5.4%
A 38358
 
5.2%
H 32338
 
4.4%
D 30688
 
4.2%
L 29714
 
4.0%
Other values (19) 231244
31.5%
Other Punctuation
ValueCountFrequency (%)
. 560765
87.4%
# 36852
 
5.7%
/ 19492
 
3.0%
& 10338
 
1.6%
* 5808
 
0.9%
? 4401
 
0.7%
, 2500
 
0.4%
! 976
 
0.2%
: 367
 
0.1%
; 177
 
< 0.1%
Other values (5) 104
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
1 1157151
15.4%
2 898438
12.0%
3 773609
10.3%
0 742907
9.9%
4 724395
9.6%
5 695725
9.3%
6 673176
9.0%
7 636233
8.5%
8 610720
8.1%
9 594451
7.9%
Other Number
ValueCountFrequency (%)
½ 622
95.5%
² 10
 
1.5%
¼ 9
 
1.4%
¾ 4
 
0.6%
³ 3
 
0.5%
3
 
0.5%
Close Punctuation
ValueCountFrequency (%)
) 3462
92.5%
] 180
 
4.8%
} 100
 
2.7%
Open Punctuation
ValueCountFrequency (%)
( 3461
92.5%
[ 180
 
4.8%
{ 100
 
2.7%
Math Symbol
ValueCountFrequency (%)
= 214
75.9%
+ 66
 
23.4%
~ 2
 
0.7%
Dash Punctuation
ValueCountFrequency (%)
- 145227
> 99.9%
1
 
< 0.1%
Space Separator
ValueCountFrequency (%)
(space) 250364
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 6142
100.0%
Currency Symbol
ValueCountFrequency (%)
¢ 3
100.0%
Other Symbol
ValueCountFrequency (%)
° 1
100.0%
Final Punctuation
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 8558740
84.1%
Latin 1612425
 
15.9%
Greek 1
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
n 283424
17.6%
s 277432
17.2%
N 95367
 
5.9%
S 74319
 
4.6%
C 61194
 
3.8%
P 58289
 
3.6%
e 48508
 
3.0%
U 43654
 
2.7%
B 39926
 
2.5%
u 39655
 
2.5%
Other values (54) 590657
36.6%
Common
ValueCountFrequency (%)
1 1157151
13.5%
2 898438
10.5%
3 773609
9.0%
0 742907
8.7%
4 724395
8.5%
5 695725
8.1%
6 673176
7.9%
7 636233
7.4%
8 610720
7.1%
9 594451
6.9%
Other values (37) 1051935
12.3%
Greek
ValueCountFrequency (%)
Σ 1
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 10170472
> 99.9%
None 688
 
< 0.1%
Number Forms 3
 
< 0.1%
Punctuation 3
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 1157151
11.4%
2 898438
 
8.8%
3 773609
 
7.6%
0 742907
 
7.3%
4 724395
 
7.1%
5 695725
 
6.8%
6 673176
 
6.6%
7 636233
 
6.3%
8 610720
 
6.0%
9 594451
 
5.8%
Other values (78) 2663667
26.2%
None
ValueCountFrequency (%)
½ 622
90.4%
è 13
 
1.9%
² 10
 
1.5%
¼ 9
 
1.3%
é 5
 
0.7%
á 4
 
0.6%
¾ 4
 
0.6%
³ 3
 
0.4%
ó 3
 
0.4%
¢ 3
 
0.4%
Other values (10) 12
 
1.7%
Number Forms
ValueCountFrequency (%)
3
100.0%
Punctuation
ValueCountFrequency (%)
1
33.3%
1
33.3%
1
33.3%

recordedBy
Text

Missing 

Distinct 146306
Distinct (%) 4.9%
Missing 806049
Missing (%) 21.1%
Memory size 29.1 MiB

Length

Max length 54675
Median length 182
Mean length 17.2249567
Min length 1

Characters and Unicode

Total characters 51813531
Distinct characters 158
Distinct categories 15
Distinct scripts 4
Distinct blocks 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 69904
Unique (%) 2.3%

Sample

1st row G. Hendler
2nd row R. C. Rollins & D. Rollins
3rd row T. Vaughan
4th row D. Harper
5th row F. Harvey
ValueCountFrequency (%)
667749
 
6.3%
j 491479
 
4.7%
a 393168
 
3.7%
r 369249
 
3.5%
e 349861
 
3.3%
c 335018
 
3.2%
m 318954
 
3.0%
h 289384
 
2.7%
w 252512
 
2.4%
l 232486
 
2.2%
Other values (54376) 6868027
65.0%

Most occurring characters

ValueCountFrequency (%)
(space) 7557395
 
14.6%
. 4811467
 
9.3%
e 3584881
 
6.9%
a 2593680
 
5.0%
r 2503087
 
4.8%
n 2366124
 
4.6%
o 2354656
 
4.5%
i 2159658
 
4.2%
l 1869857
 
3.6%
t 1864639
 
3.6%
Other values (148) 20148087
38.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 27912312
53.9%
Uppercase Letter 10166557
 
19.6%
Space Separator 7557395
 
14.6%
Other Punctuation 5908704
 
11.4%
Dash Punctuation 168034
 
0.3%
Close Punctuation 35074
 
0.1%
Open Punctuation 35044
 
0.1%
Decimal Number 17237
 
< 0.1%
Control 13056
 
< 0.1%
Math Symbol 93
 
< 0.1%
Other values (5) 25
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 3584881
12.8%
a 2593680
9.3%
r 2503087
9.0%
n 2366124
 
8.5%
o 2354656
 
8.4%
i 2159658
 
7.7%
l 1869857
 
6.7%
t 1864639
 
6.7%
s 1665441
 
6.0%
h 836236
 
3.0%
Other values (66) 6114053
21.9%
Uppercase Letter
ValueCountFrequency (%)
M 931299
 
9.2%
S 892293
 
8.8%
C 776035
 
7.6%
R 639220
 
6.3%
H 636722
 
6.3%
B 616217
 
6.1%
J 590544
 
5.8%
A 579459
 
5.7%
L 545623
 
5.4%
W 487694
 
4.8%
Other values (34) 3471451
34.1%
Other Punctuation
ValueCountFrequency (%)
. 4811467
81.4%
& 594583
 
10.1%
, 394712
 
6.7%
/ 99770
 
1.7%
' 6577
 
0.1%
: 762
 
< 0.1%
" 711
 
< 0.1%
? 81
 
< 0.1%
; 32
 
< 0.1%
# 6
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
1 3448
20.0%
9 2478
14.4%
8 2316
13.4%
0 2065
12.0%
2 1533
8.9%
3 1425
8.3%
4 1320
 
7.7%
5 1138
 
6.6%
6 815
 
4.7%
7 699
 
4.1%
Control
ValueCountFrequency (%)
12986
99.5%
68
 
0.5%
 1
 
< 0.1%
 1
 
< 0.1%
Open Punctuation
ValueCountFrequency (%)
[ 27046
77.2%
( 7998
 
22.8%
Close Punctuation
ValueCountFrequency (%)
] 27044
77.1%
) 8030
 
22.9%
Math Symbol
ValueCountFrequency (%)
= 84
90.3%
+ 9
 
9.7%
Space Separator
ValueCountFrequency (%)
(space) 7557395
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 168034
100.0%
Modifier Symbol
ValueCountFrequency (%)
´ 16
100.0%
Other Symbol
ValueCountFrequency (%)
° 6
100.0%
Initial Punctuation
ValueCountFrequency (%)
« 1
100.0%
Final Punctuation
ValueCountFrequency (%)
» 1
100.0%
Currency Symbol
ValueCountFrequency (%)
¢ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 38078867
73.5%
Common 13734662
 
26.5%
Greek 1
 
< 0.1%
Cyrillic 1
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 3584881
 
9.4%
a 2593680
 
6.8%
r 2503087
 
6.6%
n 2366124
 
6.2%
o 2354656
 
6.2%
i 2159658
 
5.7%
l 1869857
 
4.9%
t 1864639
 
4.9%
s 1665441
 
4.4%
M 931299
 
2.4%
Other values (108) 16185545
42.5%
Common
ValueCountFrequency (%)
(space) 7557395
55.0%
. 4811467
35.0%
& 594583
 
4.3%
, 394712
 
2.9%
- 168034
 
1.2%
/ 99770
 
0.7%
[ 27046
 
0.2%
] 27044
 
0.2%
12986
 
0.1%
) 8030
 
0.1%
Other values (28) 33595
 
0.2%
Greek
ValueCountFrequency (%)
β 1
100.0%
Cyrillic
ValueCountFrequency (%)
Ӧ 1
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 51710194
99.8%
None 103334
 
0.2%
IPA Ext 2
 
< 0.1%
Cyrillic 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
(space) 7557395
 
14.6%
. 4811467
 
9.3%
e 3584881
 
6.9%
a 2593680
 
5.0%
r 2503087
 
4.8%
n 2366124
 
4.6%
o 2354656
 
4.6%
i 2159658
 
4.2%
l 1869857
 
3.6%
t 1864639
 
3.6%
Other values (72) 20044750
38.8%
None
ValueCountFrequency (%)
á 17609
17.0%
é 17464
16.9%
ó 16022
15.5%
í 11942
11.6%
ñ 10358
10.0%
è 7115
6.9%
ü 5658
 
5.5%
ö 4415
 
4.3%
ê 2853
 
2.8%
ç 1312
 
1.3%
Other values (64) 8586
8.3%
IPA Ext
ValueCountFrequency (%)
ɶ 2
100.0%
Cyrillic
ValueCountFrequency (%)
Ӧ 1
100.0%
Distinct 968
Distinct (%) < 0.1%
Missing 1634
Missing (%) < 0.1%
Memory size 29.1 MiB

Length

Max length 5
Median length 1
Mean length 1.031925014
Min length 1

Characters and Unicode

Total characters 3934178
Distinct characters 10
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 362
Unique (%) < 0.1%

Sample

1st row 31
2nd row 1
3rd row 1
4th row 4
5th row 1
ValueCountFrequency (%)
1 3306583
86.7%
2 152239
 
4.0%
3 74520
 
2.0%
4 53383
 
1.4%
5 38629
 
1.0%
6 27032
 
0.7%
10 19695
 
0.5%
7 17153
 
0.4%
8 15778
 
0.4%
9 10348
 
0.3%
Other values (958) 97105
 
2.5%

Most occurring characters

ValueCountFrequency (%)
1 3383057
86.0%
2 185717
 
4.7%
3 91639
 
2.3%
4 66277
 
1.7%
5 59210
 
1.5%
0 50050
 
1.3%
6 35555
 
0.9%
7 24584
 
0.6%
8 22275
 
0.6%
9 15814
 
0.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 3934178
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 3383057
86.0%
2 185717
 
4.7%
3 91639
 
2.3%
4 66277
 
1.7%
5 59210
 
1.5%
0 50050
 
1.3%
6 35555
 
0.9%
7 24584
 
0.6%
8 22275
 
0.6%
9 15814
 
0.4%

Most occurring scripts

ValueCountFrequency (%)
Common 3934178
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 3383057
86.0%
2 185717
 
4.7%
3 91639
 
2.3%
4 66277
 
1.7%
5 59210
 
1.5%
0 50050
 
1.3%
6 35555
 
0.9%
7 24584
 
0.6%
8 22275
 
0.6%
9 15814
 
0.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3934178
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 3383057
86.0%
2 185717
 
4.7%
3 91639
 
2.3%
4 66277
 
1.7%
5 59210
 
1.5%
0 50050
 
1.3%
6 35555
 
0.9%
7 24584
 
0.6%
8 22275
 
0.6%
9 15814
 
0.4%

sex
Text

Missing 

Distinct 266
Distinct (%) < 0.1%
Missing 3118857
Missing (%) 81.8%
Memory size 29.1 MiB

Length

Max length 76
Median length 75
Mean length 5.596934593
Min length 1

Characters and Unicode

Total characters 3891224
Distinct characters 36
Distinct categories 4
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 110
Unique (%) < 0.1%

Sample

1st row unknown
2nd row Female
3rd row Male
4th row Male
5th row Male
ValueCountFrequency (%)
male 343203
46.8%
female 285482
38.9%
unknown 98697
 
13.5%
worker 2922
 
0.4%
sex 1719
 
0.2%
731
 
0.1%
hermaphrodite 126
 
< 0.1%
multiple 119
 
< 0.1%
animals 119
 
< 0.1%
of 119
 
< 0.1%
Other values (15) 560
 
0.1%
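
The word table above lists lowercase tokens ("male", "female") even though the sample rows are capitalised ("Male", "Female"), which suggests the values are lowercased and split into words before counting. The sketch below shows one way such a table could be built; the tokenisation rule (split on whitespace, strip surrounding punctuation, drop empty tokens) and the helper name `word_counts` are assumptions, since the report does not state the exact rule it uses.

```python
import string
from collections import Counter


def word_counts(values):
    """Count lowercase whitespace-separated tokens, ignoring surrounding punctuation."""
    counts = Counter()
    for value in values:
        for token in value.lower().split():
            # Strip leading/trailing punctuation such as ";" while keeping inner characters.
            token = token.strip(string.punctuation)
            if token:  # skip tokens that consisted only of punctuation
                counts[token] += 1
    return counts


# The five sample rows shown for this variable.
sample = ["unknown", "Female", "Male", "Male", "Male"]
counts = word_counts(sample)

for word, n in counts.most_common():
    print(word, n)
```

Under this rule a delimited value such as "Male; Female" contributes one count to each of "male" and "female".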

Most occurring characters

ValueCountFrequency (%)
e 919476
23.6%
a 628885
16.2%
l 628877
16.2%
m 340648
 
8.8%
n 296493
 
7.6%
M 288718
 
7.4%
F 227282
 
5.8%
o 101977
 
2.6%
k 101620
 
2.6%
w 98773
 
2.5%
Other values (26) 258475
 
6.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 3222263
82.8%
Uppercase Letter 584061
 
15.0%
Other Punctuation 46345
 
1.2%
Space Separator 38555
 
1.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 919476
28.5%
a 628885
19.5%
l 628877
19.5%
m 340648
 
10.6%
n 296493
 
9.2%
o 101977
 
3.2%
k 101620
 
3.2%
w 98773
 
3.1%
f 58364
 
1.8%
u 36551
 
1.1%
Other values (10) 10599
 
0.3%
Uppercase Letter
ValueCountFrequency (%)
M 288718
49.4%
F 227282
38.9%
U 62378
 
10.7%
W 2848
 
0.5%
S 1600
 
0.3%
E 508
 
0.1%
L 362
 
0.1%
A 362
 
0.1%
I 2
 
< 0.1%
P 1
 
< 0.1%
Other Punctuation
ValueCountFrequency (%)
; 45763
98.7%
& 490
 
1.1%
? 48
 
0.1%
/ 42
 
0.1%
, 2
 
< 0.1%
Space Separator
ValueCountFrequency (%)
(space) 38555
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 3806324
97.8%
Common 84900
 
2.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 919476
24.2%
a 628885
16.5%
l 628877
16.5%
m 340648
 
8.9%
n 296493
 
7.8%
M 288718
 
7.6%
F 227282
 
6.0%
o 101977
 
2.7%
k 101620
 
2.7%
w 98773
 
2.6%
Other values (20) 173575
 
4.6%
Common
ValueCountFrequency (%)
; 45763
53.9%
(space) 38555
45.4%
& 490
 
0.6%
? 48
 
0.1%
/ 42
 
< 0.1%
, 2
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3891224
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 919476
23.6%
a 628885
16.2%
l 628877
16.2%
m 340648
 
8.8%
n 296493
 
7.6%
M 288718
 
7.4%
F 227282
 
5.8%
o 101977
 
2.6%
k 101620
 
2.6%
w 98773
 
2.5%
Other values (26) 258475
 
6.6%

lifeStage
Text

Missing 

Distinct 1019
Distinct (%) 0.2%
Missing 3359653
Missing (%) 88.1%
Memory size 29.1 MiB

Length

Max length 80
Median length 5
Mean length 7.432049132
Min length 1

Characters and Unicode

Total characters 3377465
Distinct characters 73
Distinct categories 8
Distinct scripts 2
Distinct blocks 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 393
Unique (%) 0.1%

Sample

1st row Adult
2nd row Adult
3rd row Adult
4th row Fruiting
5th row phyllosoma VII
ValueCountFrequency (%)
adult 229330
44.1%
flowering 95850
18.4%
fruiting 41812
 
8.0%
juvenile 34834
 
6.7%
and 17801
 
3.4%
immature 16677
 
3.2%
vegetative 9757
 
1.9%
fertile 7533
 
1.4%
7209
 
1.4%
ovigerous 6452
 
1.2%
Other values (344) 52858
 
10.2%

Most occurring characters

ValueCountFrequency (%)
l 383349
11.4%
u 343495
10.2%
t 328549
9.7%
e 256052
 
7.6%
d 254357
 
7.5%
i 250388
 
7.4%
A 209190
 
6.2%
n 199864
 
5.9%
r 192680
 
5.7%
g 159378
 
4.7%
Other values (63) 800163
23.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2841564
84.1%
Uppercase Letter 436961
 
12.9%
Space Separator 65667
 
1.9%
Other Punctuation 32873
 
1.0%
Dash Punctuation 157
 
< 0.1%
Decimal Number 139
 
< 0.1%
Open Punctuation 52
 
< 0.1%
Close Punctuation 52
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
l 383349
13.5%
u 343495
12.1%
t 328549
11.6%
e 256052
9.0%
d 254357
9.0%
i 250388
8.8%
n 199864
7.0%
r 192680
6.8%
g 159378
5.6%
o 120863
 
4.3%
Other values (17) 352589
12.4%
Uppercase Letter
ValueCountFrequency (%)
A 209190
47.9%
F 147747
33.8%
I 33460
 
7.7%
J 17075
 
3.9%
V 9929
 
2.3%
L 5147
 
1.2%
S 3481
 
0.8%
W 1839
 
0.4%
E 1600
 
0.4%
C 1592
 
0.4%
Other values (15) 5901
 
1.4%
Decimal Number
ValueCountFrequency (%)
1 58
41.7%
2 39
28.1%
3 17
 
12.2%
4 14
 
10.1%
5 8
 
5.8%
8 1
 
0.7%
9 1
 
0.7%
6 1
 
0.7%
Other Punctuation
ValueCountFrequency (%)
; 32549
99.0%
? 175
 
0.5%
& 81
 
0.2%
/ 28
 
0.1%
, 19
 
0.1%
. 12
 
< 0.1%
' 9
 
< 0.1%
Open Punctuation
ValueCountFrequency (%)
( 45
86.5%
[ 7
 
13.5%
Close Punctuation
ValueCountFrequency (%)
) 45
86.5%
] 7
 
13.5%
Space Separator
ValueCountFrequency (%)
(space) 65667
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 157
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 3278525
97.1%
Common 98940
 
2.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
l 383349
11.7%
u 343495
10.5%
t 328549
10.0%
e 256052
 
7.8%
d 254357
 
7.8%
i 250388
 
7.6%
A 209190
 
6.4%
n 199864
 
6.1%
r 192680
 
5.9%
g 159378
 
4.9%
Other values (42) 701223
21.4%
Common
ValueCountFrequency (%)
(space) 65667
66.4%
; 32549
32.9%
? 175
 
0.2%
- 157
 
0.2%
& 81
 
0.1%
1 58
 
0.1%
( 45
 
< 0.1%
) 45
 
< 0.1%
2 39
 
< 0.1%
/ 28
 
< 0.1%
Other values (11) 96
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3377452
> 99.9%
None 13
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
l 383349
11.4%
u 343495
10.2%
t 328549
9.7%
e 256052
 
7.6%
d 254357
 
7.5%
i 250388
 
7.4%
A 209190
 
6.2%
n 199864
 
5.9%
r 192680
 
5.7%
g 159378
 
4.7%
Other values (61) 800150
23.7%
None
ValueCountFrequency (%)
ü 9
69.2%
í 4
30.8%

reproductiveCondition
Text

Constant  Missing 

Distinct 1
Distinct (%) 100.0%
Missing 3814098
Missing (%) > 99.9%
Memory size 29.1 MiB

Length

Max length 58
Median length 58
Mean length 58
Min length 58

Characters and Unicode

Total characters 58
Distinct characters 21
Distinct categories 4
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 1
Unique (%) 100.0%

Sample

1st row Animalia, Chordata, Vertebrata, Amphibia, Anura, Bufonidae
ValueCountFrequency (%)
animalia 1
16.7%
chordata 1
16.7%
vertebrata 1
16.7%
amphibia 1
16.7%
anura 1
16.7%
bufonidae 1
16.7%

Most occurring characters

ValueCountFrequency (%)
a 9
15.5%
i 5
 
8.6%
, 5
 
8.6%
(space) 5
 
8.6%
r 4
 
6.9%
A 3
 
5.2%
e 3
 
5.2%
t 3
 
5.2%
n 3
 
5.2%
d 2
 
3.4%
Other values (11) 16
27.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 42
72.4%
Uppercase Letter 6
 
10.3%
Other Punctuation 5
 
8.6%
Space Separator 5
 
8.6%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 9
21.4%
i 5
11.9%
r 4
9.5%
e 3
 
7.1%
t 3
 
7.1%
n 3
 
7.1%
d 2
 
4.8%
u 2
 
4.8%
b 2
 
4.8%
o 2
 
4.8%
Other values (5) 7
16.7%
Uppercase Letter
ValueCountFrequency (%)
A 3
50.0%
V 1
 
16.7%
C 1
 
16.7%
B 1
 
16.7%
Other Punctuation
ValueCountFrequency (%)
, 5
100.0%
Space Separator
ValueCountFrequency (%)
(space) 5
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 48
82.8%
Common 10
 
17.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 9
18.8%
i 5
10.4%
r 4
 
8.3%
A 3
 
6.2%
e 3
 
6.2%
t 3
 
6.2%
n 3
 
6.2%
d 2
 
4.2%
u 2
 
4.2%
b 2
 
4.2%
Other values (9) 12
25.0%
Common
ValueCountFrequency (%)
, 5
50.0%
(space) 5
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 58
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 9
15.5%
i 5
 
8.6%
, 5
 
8.6%
(space) 5
 
8.6%
r 4
 
6.9%
A 3
 
5.2%
e 3
 
5.2%
t 3
 
5.2%
n 3
 
5.2%
d 2
 
3.4%
Other values (11) 16
27.6%

caste
Text

Constant  Missing 

Distinct 1
Distinct (%) 100.0%
Missing 3814098
Missing (%) > 99.9%
Memory size 29.1 MiB

Length

Max length 8
Median length 8
Mean length 8
Min length 8

Characters and Unicode

Total characters 8
Distinct characters 6
Distinct categories 2
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 1
Unique (%) 100.0%

Sample

1st row Animalia
ValueCountFrequency (%)
animalia 1
100.0%

Most occurring characters

ValueCountFrequency (%)
i 2
25.0%
a 2
25.0%
A 1
12.5%
n 1
12.5%
m 1
12.5%
l 1
12.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 7
87.5%
Uppercase Letter 1
 
12.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 2
28.6%
a 2
28.6%
n 1
14.3%
m 1
14.3%
l 1
14.3%
Uppercase Letter
ValueCountFrequency (%)
A 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 8
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 2
25.0%
a 2
25.0%
A 1
12.5%
n 1
12.5%
m 1
12.5%
l 1
12.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 8
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 2
25.0%
a 2
25.0%
A 1
12.5%
n 1
12.5%
m 1
12.5%
l 1
12.5%

behavior
Text

Constant  Missing 

Distinct 1
Distinct (%) 100.0%
Missing 3814098
Missing (%) > 99.9%
Memory size 29.1 MiB

Length

Max length 8
Median length 8
Mean length 8
Min length 8

Characters and Unicode

Total characters 8
Distinct characters 7
Distinct categories 2
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 1
Unique (%) 100.0%

Sample

1st row Chordata
ValueCountFrequency (%)
chordata 1
100.0%

Most occurring characters

ValueCountFrequency (%)
a 2
25.0%
C 1
12.5%
h 1
12.5%
o 1
12.5%
r 1
12.5%
d 1
12.5%
t 1
12.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 7
87.5%
Uppercase Letter 1
 
12.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 2
28.6%
h 1
14.3%
o 1
14.3%
r 1
14.3%
d 1
14.3%
t 1
14.3%
Uppercase Letter
ValueCountFrequency (%)
C 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 8
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 2
25.0%
C 1
12.5%
h 1
12.5%
o 1
12.5%
r 1
12.5%
d 1
12.5%
t 1
12.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 8
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 2
25.0%
C 1
12.5%
h 1
12.5%
o 1
12.5%
r 1
12.5%
d 1
12.5%
t 1
12.5%

vitality
Text

Constant  Missing 

Distinct 1
Distinct (%) 100.0%
Missing 3814098
Missing (%) > 99.9%
Memory size 29.1 MiB

Length

Max length 8
Median length 8
Mean length 8
Min length 8

Characters and Unicode

Total characters 8
Distinct characters 7
Distinct categories 2
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 1
Unique (%) 100.0%

Sample

1st row Amphibia
ValueCountFrequency (%)
amphibia 1
100.0%

Most occurring characters

ValueCountFrequency (%)
i 2
25.0%
A 1
12.5%
m 1
12.5%
p 1
12.5%
h 1
12.5%
b 1
12.5%
a 1
12.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 7
87.5%
Uppercase Letter 1
 
12.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 2
28.6%
m 1
14.3%
p 1
14.3%
h 1
14.3%
b 1
14.3%
a 1
14.3%
Uppercase Letter
ValueCountFrequency (%)
A 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 8
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 2
25.0%
A 1
12.5%
m 1
12.5%
p 1
12.5%
h 1
12.5%
b 1
12.5%
a 1
12.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 8
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 2
25.0%
A 1
12.5%
m 1
12.5%
p 1
12.5%
h 1
12.5%
b 1
12.5%
a 1
12.5%

establishmentMeans
Text

Constant  Missing 

Distinct 1
Distinct (%) 100.0%
Missing 3814098
Missing (%) > 99.9%
Memory size 29.1 MiB

Length

Max length 5
Median length 5
Mean length 5
Min length 5

Characters and Unicode

Total characters 5
Distinct characters 5
Distinct categories 2
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 1
Unique (%) 100.0%

Sample

1st row Anura
ValueCountFrequency (%)
anura 1
100.0%

Most occurring characters

ValueCountFrequency (%)
A 1
20.0%
n 1
20.0%
u 1
20.0%
r 1
20.0%
a 1
20.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 4
80.0%
Uppercase Letter 1
 
20.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n 1
25.0%
u 1
25.0%
r 1
25.0%
a 1
25.0%
Uppercase Letter
ValueCountFrequency (%)
A 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 5
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 1
20.0%
n 1
20.0%
u 1
20.0%
r 1
20.0%
a 1
20.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A 1
20.0%
n 1
20.0%
u 1
20.0%
r 1
20.0%
a 1
20.0%

pathway
Text

Constant  Missing 

Distinct 1
Distinct (%) 100.0%
Missing 3814098
Missing (%) > 99.9%
Memory size 29.1 MiB

Length

Max length 9
Median length 9
Mean length 9
Min length 9

Characters and Unicode

Total characters 9
Distinct characters 9
Distinct categories 2
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 1
Unique (%) 100.0%

Sample

1st row Bufonidae
ValueCountFrequency (%)
bufonidae 1
100.0%

Most occurring characters

ValueCountFrequency (%)
B 1
11.1%
u 1
11.1%
f 1
11.1%
o 1
11.1%
n 1
11.1%
i 1
11.1%
d 1
11.1%
a 1
11.1%
e 1
11.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 8
88.9%
Uppercase Letter 1
 
11.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
u 1
12.5%
f 1
12.5%
o 1
12.5%
n 1
12.5%
i 1
12.5%
d 1
12.5%
a 1
12.5%
e 1
12.5%
Uppercase Letter
ValueCountFrequency (%)
B 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 9
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
B 1
11.1%
u 1
11.1%
f 1
11.1%
o 1
11.1%
n 1
11.1%
i 1
11.1%
d 1
11.1%
a 1
11.1%
e 1
11.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 9
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
B 1
11.1%
u 1
11.1%
f 1
11.1%
o 1
11.1%
n 1
11.1%
i 1
11.1%
d 1
11.1%
a 1
11.1%
e 1
11.1%

preparations
Text

Missing 

Distinct 1356
Distinct (%) 0.1%
Missing 1975286
Missing (%) 51.8%
Memory size 29.1 MiB

Length

Max length 192
Median length 157
Mean length 9.648374794
Min length 1

Characters and Unicode

Total characters 17741557
Distinct characters 74
Distinct categories 8
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 545
Unique (%) < 0.1%

Sample

1st row Alcohol (Ethanol)
2nd row Ethanol
3rd row Dry
4th row Alcohol (Ethanol)
5th row Pinned
ValueCountFrequency (%)
ethanol 603904
21.8%
dry 379221
13.7%
alcohol 369929
13.4%
skin 344604
12.4%
whole 220441
 
8.0%
skull 185981
 
6.7%
pinned 160804
 
5.8%
slide 80393
 
2.9%
fluid 55206
 
2.0%
envelope 47332
 
1.7%
Other values (251) 322339
11.6%
2025-01-14T11:38:58.175718image/svg+xmlMatplotlib v3.9.4, https://matplotlib.org/

Most occurring characters

ValueCountFrequency (%)
l 2282490
 
12.9%
o 1831241
 
10.3%
n 1442452
 
8.1%
h 1247138
 
7.0%
931341
 
5.2%
i 795533
 
4.5%
e 770760
 
4.3%
a 765971
 
4.3%
t 743634
 
4.2%
S 684993
 
3.9%
Other values (64) 6246004
35.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 12811436
72.2%
Uppercase Letter 2720642
 
15.3%
Space Separator 931341
 
5.2%
Other Punctuation 461552
 
2.6%
Open Punctuation 396222
 
2.2%
Close Punctuation 396222
 
2.2%
Decimal Number 16200
 
0.1%
Dash Punctuation 7942
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
l 2282490
17.8%
o 1831241
14.3%
n 1442452
11.3%
h 1247138
9.7%
i 795533
 
6.2%
e 770760
 
6.0%
a 765971
 
6.0%
t 743634
 
5.8%
k 580419
 
4.5%
r 510603
 
4.0%
Other values (16) 1841195
14.4%
Uppercase Letter
ValueCountFrequency (%)
S 684993
25.2%
E 669745
24.6%
D 380446
14.0%
A 375707
13.8%
W 238393
 
8.8%
P 194234
 
7.1%
F 72380
 
2.7%
M 25443
 
0.9%
B 12825
 
0.5%
L 10936
 
0.4%
Other values (15) 55540
 
2.0%
Decimal Number
ValueCountFrequency (%)
9 7489
46.2%
5 7357
45.4%
0 743
 
4.6%
8 412
 
2.5%
7 165
 
1.0%
1 16
 
0.1%
2 15
 
0.1%
3 2
 
< 0.1%
6 1
 
< 0.1%
Other Punctuation
ValueCountFrequency (%)
: 231665
50.2%
; 218925
47.4%
% 8082
 
1.8%
& 1368
 
0.3%
/ 1345
 
0.3%
. 105
 
< 0.1%
, 59
 
< 0.1%
? 3
 
< 0.1%
Open Punctuation
ValueCountFrequency (%)
( 394620
99.6%
[ 1602
 
0.4%
Close Punctuation
ValueCountFrequency (%)
) 394620
99.6%
] 1602
 
0.4%
Space Separator
ValueCountFrequency (%)
931341
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 7942
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 15532078
87.5%
Common 2209479
 
12.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
l 2282490
14.7%
o 1831241
11.8%
n 1442452
 
9.3%
h 1247138
 
8.0%
i 795533
 
5.1%
e 770760
 
5.0%
a 765971
 
4.9%
t 743634
 
4.8%
S 684993
 
4.4%
E 669745
 
4.3%
Other values (41) 4298121
27.7%
Common
ValueCountFrequency (%)
931341
42.2%
( 394620
17.9%
) 394620
17.9%
: 231665
 
10.5%
; 218925
 
9.9%
% 8082
 
0.4%
- 7942
 
0.4%
9 7489
 
0.3%
5 7357
 
0.3%
] 1602
 
0.1%
Other values (13) 5836
 
0.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 17741557
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
l 2282490
 
12.9%
o 1831241
 
10.3%
n 1442452
 
8.1%
h 1247138
 
7.0%
931341
 
5.2%
i 795533
 
4.5%
e 770760
 
4.3%
a 765971
 
4.3%
t 743634
 
4.2%
S 684993
 
3.9%
Other values (64) 6246004
35.2%

disposition
Text

Constant  Missing 

Distinct: 1
Distinct (%): 100.0%
Missing: 3814098
Missing (%): > 99.9%
Memory size: 29.1 MiB

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 8
Distinct characters: 7
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st row: Rhinella

Value Count Frequency (%)
rhinella 1 100.0%

Most occurring characters

ValueCountFrequency (%)
l 2
25.0%
R 1
12.5%
h 1
12.5%
i 1
12.5%
n 1
12.5%
e 1
12.5%
a 1
12.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 7
87.5%
Uppercase Letter 1
 
12.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
l 2
28.6%
h 1
14.3%
i 1
14.3%
n 1
14.3%
e 1
14.3%
a 1
14.3%
Uppercase Letter
ValueCountFrequency (%)
R 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 8
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
l 2
25.0%
R 1
12.5%
h 1
12.5%
i 1
12.5%
n 1
12.5%
e 1
12.5%
a 1
12.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 8
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
l 2
25.0%
R 1
12.5%
h 1
12.5%
i 1
12.5%
n 1
12.5%
e 1
12.5%
a 1
12.5%
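
The percentages in these summaries are consistent with being computed over non-missing rows only: pathway has one distinct value in a single populated row (100.0%), while preparations has 1356 distinct values across its populated rows (0.1%). The sketch below reproduces both figures from the counts in this report. The function name `distinct_pct` is illustrative, and the formula is inferred from the reported numbers rather than taken from the profiling tool's documentation.

```python
def distinct_pct(distinct: int, n_rows: int, missing: int) -> float:
    """Distinct values as a percentage of non-missing rows (assumed formula)."""
    present = n_rows - missing
    if present <= 0:
        raise ValueError("column has no non-missing rows")
    return 100.0 * distinct / present


N_ROWS = 3_814_099  # "Number of observations" from the report overview

# pathway: 1 distinct value, 3,814,098 missing, so one populated row
pathway_pct = distinct_pct(1, N_ROWS, 3_814_098)

# preparations: 1,356 distinct values, 1,975,286 missing
preparations_pct = distinct_pct(1_356, N_ROWS, 1_975_286)

print(round(pathway_pct, 1), round(preparations_pct, 1))  # 100.0 0.1
```

Read this way, a "Distinct (%)" of 100.0% on a field that is more than 99.9% missing says only that its few populated rows differ from each other.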

associatedMedia
Text

Missing 

Distinct: 2007097
Distinct (%): 83.0%
Missing: 1396847
Missing (%): 36.6%
Memory size: 29.1 MiB

Length

Max length: 1040
Median length: 49
Mean length: 50.09518329
Min length: 42

Characters and Unicode

Total characters: 121092682
Distinct characters: 31
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 1946069
Unique (%): 80.5%

Sample

1st row: https://collections.nmnh.si.edu/media/?i=14071815
2nd row: https://collections.nmnh.si.edu/media/?i=15812604
3rd row: https://collections.nmnh.si.edu/media/?i=16381603
4th row: https://collections.nmnh.si.edu/media/?i=15690882
5th row: https://collections.nmnh.si.edu/media/?i=14020520

Value Count Frequency (%)
14558510 1287 < 0.1%
14894714 1283 < 0.1%
14888503 1224 < 0.1%
14888504 881 < 0.1%
5000376 839 < 0.1%
5000375 839 < 0.1%
https://collections.nmnh.si.edu/media/?i=10674432 657 < 0.1%
15777181 615 < 0.1%
https://collections.nmnh.si.edu/media/?i=10689696 591 < 0.1%
15596573 565 < 0.1%
Other values (2238001) 2687369 99.7%

Most occurring characters

ValueCountFrequency (%)
/ 9669008
 
8.0%
i 9669008
 
8.0%
s 7251756
 
6.0%
e 7251756
 
6.0%
n 7251756
 
6.0%
. 7251756
 
6.0%
t 7251756
 
6.0%
h 4834504
 
4.0%
c 4834504
 
4.0%
o 4834504
 
4.0%
Other values (21) 50992374
42.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 74934812
61.9%
Other Punctuation 22034168
 
18.2%
Decimal Number 21427552
 
17.7%
Math Symbol 2417252
 
2.0%
Space Separator 278898
 
0.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 9669008
12.9%
s 7251756
9.7%
e 7251756
9.7%
n 7251756
9.7%
t 7251756
9.7%
h 4834504
 
6.5%
c 4834504
 
6.5%
o 4834504
 
6.5%
l 4834504
 
6.5%
m 4834504
 
6.5%
Other values (4) 12086260
16.1%
Decimal Number
ValueCountFrequency (%)
1 4679824
21.8%
5 2199813
10.3%
4 2170502
10.1%
3 1974185
9.2%
2 1952018
9.1%
0 1830529
 
8.5%
6 1794059
 
8.4%
8 1719074
 
8.0%
7 1561962
 
7.3%
9 1545586
 
7.2%
Other Punctuation
ValueCountFrequency (%)
/ 9669008
43.9%
. 7251756
32.9%
? 2417252
 
11.0%
: 2417252
 
11.0%
; 278900
 
1.3%
Math Symbol
ValueCountFrequency (%)
= 2417252
100.0%
Space Separator
ValueCountFrequency (%)
278898
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 74934812
61.9%
Common 46157870
38.1%

Most frequent character per script

Common
ValueCountFrequency (%)
/ 9669008
20.9%
. 7251756
15.7%
1 4679824
10.1%
? 2417252
 
5.2%
= 2417252
 
5.2%
: 2417252
 
5.2%
5 2199813
 
4.8%
4 2170502
 
4.7%
3 1974185
 
4.3%
2 1952018
 
4.2%
Other values (7) 9009008
19.5%
Latin
ValueCountFrequency (%)
i 9669008
12.9%
s 7251756
9.7%
e 7251756
9.7%
n 7251756
9.7%
t 7251756
9.7%
h 4834504
 
6.5%
c 4834504
 
6.5%
o 4834504
 
6.5%
l 4834504
 
6.5%
m 4834504
 
6.5%
Other values (4) 12086260
16.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 121092682
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
/ 9669008
 
8.0%
i 9669008
 
8.0%
s 7251756
 
6.0%
e 7251756
 
6.0%
n 7251756
 
6.0%
. 7251756
 
6.0%
t 7251756
 
6.0%
h 4834504
 
4.0%
c 4834504
 
4.0%
o 4834504
 
4.0%
Other values (21) 50992374
42.1%

associatedSequences
Text

Missing 

Distinct: 5043
Distinct (%): 99.4%
Missing: 3809026
Missing (%): 99.9%
Memory size: 29.1 MiB

Length

Max length: 12558
Median length: 49
Mean length: 104.1133452
Min length: 21

Characters and Unicode

Total characters: 528167
Distinct characters: 67
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 5032
Unique (%): 99.2%

Sample

1st row: https://www.ncbi.nlm.nih.gov/gquery?term=KM080038
2nd row: https://www.ncbi.nlm.nih.gov/gquery?term=EU823242|https://www.ncbi.nlm.nih.gov/gquery?term=EU823167|https://www.ncbi.nlm.nih.gov/gquery?term=KC246618
3rd row: https://www.ncbi.nlm.nih.gov/gquery?term=MN549733
4th row: https://www.ncbi.nlm.nih.gov/gquery?term=KC771789|https://www.ncbi.nlm.nih.gov/gquery?term=KC771632
5th row: https://www.ncbi.nlm.nih.gov/gquery?term=HQ600894

Value Count Frequency (%)
https://www.ncbi.nlm.nih.gov/gquery?term=prjna521985 12 0.2%
https://www.ncbi.nlm.nih.gov/gquery?term=km521547 8 0.2%
https://www.ncbi.nlm.nih.gov/gquery?term=ay273864 4 0.1%
https://www.ncbi.nlm.nih.gov/gquery?term=ay273835 3 0.1%
https://www.ncbi.nlm.nih.gov/gquery?term=ay273832 2 < 0.1%
https://www.ncbi.nlm.nih.gov/gquery?term=fj207364 2 < 0.1%
https://www.ncbi.nlm.nih.gov/gquery?term=kf989555|https://www.ncbi.nlm.nih.gov/gquery?term=kf989872|https://www.ncbi.nlm.nih.gov/gquery?term=kf989774|https://www.ncbi.nlm.nih.gov/gquery?term=kf989974|https://www.ncbi.nlm.nih.gov/gquery?term=kf989663 2 < 0.1%
https://www.ncbi.nlm.nih.gov/gquery?term=mh244118 2 < 0.1%
https://www.ncbi.nlm.nih.gov/gquery?term=kp739770 2 < 0.1%
https://www.ncbi.nlm.nih.gov/gquery?term=jn837192|https://www.ncbi.nlm.nih.gov/gquery?term=jn837282|https://www.ncbi.nlm.nih.gov/gquery?term=jn837372|https://www.ncbi.nlm.nih.gov/gquery?term=jn837475 2 < 0.1%
Other values (5034) 5035 99.2%

Most occurring characters

ValueCountFrequency (%)
. 42512
 
8.0%
t 31870
 
6.0%
/ 31869
 
6.0%
w 31869
 
6.0%
n 31869
 
6.0%
r 21250
 
4.0%
i 21248
 
4.0%
g 21248
 
4.0%
e 21247
 
4.0%
m 21247
 
4.0%
Other values (57) 251938
47.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 329331
62.4%
Other Punctuation 95629
 
18.1%
Decimal Number 64584
 
12.2%
Uppercase Letter 22264
 
4.2%
Math Symbol 16174
 
3.1%
Dash Punctuation 183
 
< 0.1%
Space Separator 1
 
< 0.1%
Connector Punctuation 1
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
K 4071
18.3%
M 2718
12.2%
J 2341
10.5%
U 1725
 
7.7%
Q 1614
 
7.2%
F 1316
 
5.9%
E 865
 
3.9%
R 815
 
3.7%
T 732
 
3.3%
W 728
 
3.3%
Other values (16) 5339
24.0%
Lowercase Letter
ValueCountFrequency (%)
t 31870
 
9.7%
w 31869
 
9.7%
n 31869
 
9.7%
r 21250
 
6.5%
i 21248
 
6.5%
g 21248
 
6.5%
e 21247
 
6.5%
m 21247
 
6.5%
h 21246
 
6.5%
o 10624
 
3.2%
Other values (11) 95613
29.0%
Decimal Number
ValueCountFrequency (%)
7 7494
11.6%
2 7086
11.0%
4 6564
10.2%
8 6478
10.0%
1 6347
9.8%
9 6334
9.8%
6 6120
9.5%
3 6086
9.4%
0 6082
9.4%
5 5993
9.3%
Other Punctuation
ValueCountFrequency (%)
. 42512
44.5%
/ 31869
33.3%
: 10623
 
11.1%
? 10623
 
11.1%
" 2
 
< 0.1%
Math Symbol
ValueCountFrequency (%)
= 10623
65.7%
| 5551
34.3%
Dash Punctuation
ValueCountFrequency (%)
- 183
100.0%
Space Separator
ValueCountFrequency (%)
1
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 351595
66.6%
Common 176572
33.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
t 31870
 
9.1%
w 31869
 
9.1%
n 31869
 
9.1%
r 21250
 
6.0%
i 21248
 
6.0%
g 21248
 
6.0%
e 21247
 
6.0%
m 21247
 
6.0%
h 21246
 
6.0%
o 10624
 
3.0%
Other values (37) 117877
33.5%
Common
ValueCountFrequency (%)
. 42512
24.1%
/ 31869
18.0%
: 10623
 
6.0%
? 10623
 
6.0%
= 10623
 
6.0%
7 7494
 
4.2%
2 7086
 
4.0%
4 6564
 
3.7%
8 6478
 
3.7%
1 6347
 
3.6%
Other values (10) 36353
20.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 528167
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 42512
 
8.0%
t 31870
 
6.0%
/ 31869
 
6.0%
w 31869
 
6.0%
n 31869
 
6.0%
r 21250
 
4.0%
i 21248
 
4.0%
g 21248
 
4.0%
e 21247
 
4.0%
m 21247
 
4.0%
Other values (57) 251938
47.7%
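
The associatedSequences samples show that one cell can hold several NCBI search URLs joined by "|", which is why whole multi-URL strings appear as single tokens in the frequency table. A minimal sketch for splitting such a cell into its search terms follows. The helper name `accession_terms` is illustrative, and it assumes every entry follows the `?term=` pattern seen in the sample rows.

```python
from urllib.parse import parse_qs, urlparse


def accession_terms(value: str) -> list[str]:
    """Return the `term` query values from a '|'-delimited list of URLs."""
    terms: list[str] = []
    for url in value.split("|"):
        query = parse_qs(urlparse(url.strip()).query)
        # Entries without a `term` parameter contribute nothing.
        terms.extend(query.get("term", []))
    return terms


# 2nd sample row of associatedSequences, copied from the report
row = (
    "https://www.ncbi.nlm.nih.gov/gquery?term=EU823242|"
    "https://www.ncbi.nlm.nih.gov/gquery?term=EU823167|"
    "https://www.ncbi.nlm.nih.gov/gquery?term=KC246618"
)
terms = accession_terms(row)
print(terms)  # ['EU823242', 'EU823167', 'KC246618']
```

Counting terms rather than cells gives a per-sequence view that the cell-level "Distinct" and "Unique" figures cannot provide.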

occurrenceRemarks
Text

Missing 

Distinct: 253750
Distinct (%): 50.0%
Missing: 3306658
Missing (%): 86.7%
Memory size: 29.1 MiB

Length

Max length: 233869
Median length: 3014
Mean length: 66.88194293
Min length: 1

Characters and Unicode

Total characters: 33938640
Distinct characters: 173
Distinct categories: 20
Distinct scripts: 4
Distinct blocks: 5

Unique

Unique: 219585
Unique (%): 43.3%

Sample

1st row: Ninoe sp. B
2nd row: {"hostGen":"Wallago","hostSpec":"after","hostBodyLoc":"stomach"}; Original USNPC preservative was a solution of 70% ethanol, 3% formalin, and 2% glycerine
3rd row: {"hostGen":"Catoptrophorus","hostSpec":"semipalmatus","hostBodyLoc":"esophagus","hostFldNo":"JEBadley-426-23"}; Glycerin jelly
4th row: Scripps Institution of Oceanography library archives about M.J. Johnson Phyllosoma Collection: specimens were stained with fast green and are mounted mostly in Canada balsam, Harleco synthetic resin or diatex.
5th row: 8/28/28; 6527; Orcutt; Chamberlain Coll

Value Count Frequency (%)
of 102829 2.1%
by 78963 1.6%
and 73475 1.5%
the 71216 1.5%
coll 62126 1.3%
(blank) 56730 1.2%
a 55849 1.1%
to 50755 1.0%
was 43759 0.9%
in 42473 0.9%
Other values (208280) 4265119 87.0%

Most occurring characters

ValueCountFrequency (%)
4357921
 
12.8%
e 2347617
 
6.9%
o 1831790
 
5.4%
a 1822117
 
5.4%
i 1658183
 
4.9%
t 1581641
 
4.7%
n 1541699
 
4.5%
r 1401352
 
4.1%
s 1326121
 
3.9%
l 1310982
 
3.9%
Other values (163) 14759217
43.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 20844026
61.4%
Space Separator 4357921
 
12.8%
Uppercase Letter 3347998
 
9.9%
Other Punctuation 2602178
 
7.7%
Decimal Number 2132975
 
6.3%
Control 203815
 
0.6%
Dash Punctuation 172638
 
0.5%
Open Punctuation 123338
 
0.4%
Close Punctuation 123285
 
0.4%
Math Symbol 23926
 
0.1%
Other values (10) 6540
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 2347617
11.3%
o 1831790
 
8.8%
a 1822117
 
8.7%
i 1658183
 
8.0%
t 1581641
 
7.6%
n 1541699
 
7.4%
r 1401352
 
6.7%
s 1326121
 
6.4%
l 1310982
 
6.3%
d 887403
 
4.3%
Other values (53) 5135121
24.6%
Uppercase Letter
ValueCountFrequency (%)
S 393212
 
11.7%
C 350050
 
10.5%
P 208690
 
6.2%
B 182921
 
5.5%
N 177358
 
5.3%
M 176100
 
5.3%
F 172695
 
5.2%
T 155295
 
4.6%
A 149808
 
4.5%
L 143338
 
4.3%
Other values (27) 1238531
37.0%
Other Punctuation
ValueCountFrequency (%)
. 768820
29.5%
" 514712
19.8%
; 485790
18.7%
, 340235
13.1%
: 278626
 
10.7%
% 69874
 
2.7%
/ 56498
 
2.2%
! 27039
 
1.0%
' 21725
 
0.8%
# 18216
 
0.7%
Other values (9) 20643
 
0.8%
Decimal Number
ValueCountFrequency (%)
1 422009
19.8%
2 276489
13.0%
0 240230
11.3%
9 233861
11.0%
3 188131
8.8%
7 169939
8.0%
5 159142
 
7.5%
6 155302
 
7.3%
4 150818
 
7.1%
8 137054
 
6.4%
Math Symbol
ValueCountFrequency (%)
= 12697
53.1%
+ 5742
24.0%
| 5218
21.8%
~ 112
 
0.5%
> 91
 
0.4%
< 43
 
0.2%
× 19
 
0.1%
± 4
 
< 0.1%
Other Symbol
ValueCountFrequency (%)
° 1531
95.0%
46
 
2.9%
21
 
1.3%
© 9
 
0.6%
5
 
0.3%
Other Number
ValueCountFrequency (%)
½ 15
68.2%
¼ 2
 
9.1%
¹ 2
 
9.1%
¾ 2
 
9.1%
³ 1
 
4.5%
Dash Punctuation
ValueCountFrequency (%)
- 171764
99.5%
865
 
0.5%
9
 
< 0.1%
Open Punctuation
ValueCountFrequency (%)
( 82302
66.7%
{ 36447
29.6%
[ 4589
 
3.7%
Close Punctuation
ValueCountFrequency (%)
) 82267
66.7%
} 36442
29.6%
] 4576
 
3.7%
Final Punctuation
ValueCountFrequency (%)
195
97.0%
5
 
2.5%
» 1
 
0.5%
Nonspacing Mark
ValueCountFrequency (%)
́ 138
60.0%
̀ 46
 
20.0%
̧ 46
 
20.0%
Control
ValueCountFrequency (%)
202744
99.5%
1071
 
0.5%
Initial Punctuation
ValueCountFrequency (%)
185
99.5%
« 1
 
0.5%
Modifier Symbol
ValueCountFrequency (%)
^ 5
62.5%
´ 3
37.5%
Space Separator
ValueCountFrequency (%)
4357921
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 3618
100.0%
Other Letter
ValueCountFrequency (%)
º 485
100.0%
Currency Symbol
ValueCountFrequency (%)
$ 177
100.0%
Format
ValueCountFrequency (%)
 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 24192452
71.3%
Common 9745914
28.7%
Inherited 230
 
< 0.1%
Greek 44
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 2347617
 
9.7%
o 1831790
 
7.6%
a 1822117
 
7.5%
i 1658183
 
6.9%
t 1581641
 
6.5%
n 1541699
 
6.4%
r 1401352
 
5.8%
s 1326121
 
5.5%
l 1310982
 
5.4%
d 887403
 
3.7%
Other values (88) 8483547
35.1%
Common
ValueCountFrequency (%)
4357921
44.7%
. 768820
 
7.9%
" 514712
 
5.3%
; 485790
 
5.0%
1 422009
 
4.3%
, 340235
 
3.5%
: 278626
 
2.9%
2 276489
 
2.8%
0 240230
 
2.5%
9 233861
 
2.4%
Other values (60) 1827221
18.7%
Inherited
ValueCountFrequency (%)
́ 138
60.0%
̀ 46
 
20.0%
̧ 46
 
20.0%
Greek
ValueCountFrequency (%)
μ 43
97.7%
π 1
 
2.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 33933102
> 99.9%
None 3879
 
< 0.1%
Punctuation 1357
 
< 0.1%
Diacriticals 230
 
< 0.1%
Misc Symbols 72
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
4357921
 
12.8%
e 2347617
 
6.9%
o 1831790
 
5.4%
a 1822117
 
5.4%
i 1658183
 
4.9%
t 1581641
 
4.7%
n 1541699
 
4.5%
r 1401352
 
4.1%
s 1326121
 
3.9%
l 1310982
 
3.9%
Other values (86) 14753679
43.5%
None
ValueCountFrequency (%)
° 1531
39.5%
º 485
 
12.5%
é 435
 
11.2%
í 370
 
9.5%
ñ 156
 
4.0%
á 151
 
3.9%
· 87
 
2.2%
ã 75
 
1.9%
ü 75
 
1.9%
ó 74
 
1.9%
Other values (54) 440
 
11.3%
Punctuation
ValueCountFrequency (%)
865
63.7%
195
 
14.4%
185
 
13.6%
74
 
5.5%
24
 
1.8%
9
 
0.7%
5
 
0.4%
Diacriticals
ValueCountFrequency (%)
́ 138
60.0%
̀ 46
 
20.0%
̧ 46
 
20.0%
Misc Symbols
ValueCountFrequency (%)
46
63.9%
21
29.2%
5
 
6.9%
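
Some occurrenceRemarks values (the 2nd and 3rd sample rows) begin with a JSON object of host details followed by "; " and free text, while others are plain text. The sketch below separates the two parts with the standard-library `json.JSONDecoder.raw_decode`. The helper name `split_remark` is illustrative, and the layout (JSON first, then a semicolon) is assumed from those two samples only. Because the character counts report slightly more "{" (36447) than "}" (36442), malformed prefixes are passed through as plain text.

```python
import json

_DECODER = json.JSONDecoder()


def split_remark(value: str) -> tuple[dict, str]:
    """Split a remark into (leading JSON object, remaining free text)."""
    text = value.lstrip()
    if not text.startswith("{"):
        return {}, value
    try:
        # raw_decode parses one JSON value and reports where it ended.
        obj, end = _DECODER.raw_decode(text)
    except json.JSONDecodeError:
        return {}, value
    # Drop the "; " separator that follows the JSON object in the samples.
    return obj, text[end:].lstrip("; ").strip()


sample = (
    '{"hostGen":"Catoptrophorus","hostSpec":"semipalmatus",'
    '"hostBodyLoc":"esophagus","hostFldNo":"JEBadley-426-23"}; Glycerin jelly'
)
host, note = split_remark(sample)
print(host["hostGen"], "|", note)  # Catoptrophorus | Glycerin jelly
```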

organismName
Text

Missing 

Distinct: 2
Distinct (%): 100.0%
Missing: 3814097
Missing (%): > 99.9%
Memory size: 29.1 MiB

Length

Max length: 5
Median length: 4.5
Mean length: 4.5
Min length: 4

Characters and Unicode

Total characters: 9
Distinct characters: 6
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 2
Unique (%): 100.0%

Sample

1st row: 69.0
2nd row: 720.0

Value Count Frequency (%)
69.0 1 50.0%
720.0 1 50.0%

Most occurring characters

ValueCountFrequency (%)
0 3
33.3%
. 2
22.2%
6 1
 
11.1%
9 1
 
11.1%
7 1
 
11.1%
2 1
 
11.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 7
77.8%
Other Punctuation 2
 
22.2%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 3
42.9%
6 1
 
14.3%
9 1
 
14.3%
7 1
 
14.3%
2 1
 
14.3%
Other Punctuation
ValueCountFrequency (%)
. 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 9
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 3
33.3%
. 2
22.2%
6 1
 
11.1%
9 1
 
11.1%
7 1
 
11.1%
2 1
 
11.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 9
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 3
33.3%
. 2
22.2%
6 1
 
11.1%
9 1
 
11.1%
7 1
 
11.1%
2 1
 
11.1%

verbatimLabel
Text

Constant  Missing 

Distinct: 1
Distinct (%): 100.0%
Missing: 3814098
Missing (%): > 99.9%
Memory size: 29.1 MiB

Length

Max length: 45
Median length: 45
Mean length: 45
Min length: 45

Characters and Unicode

Total characters: 45
Distinct characters: 23
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st row: North America, Canada, Nunavut, Baffin Island

Value Count Frequency (%)
north 1 16.7%
america 1 16.7%
canada 1 16.7%
nunavut 1 16.7%
baffin 1 16.7%
island 1 16.7%

Most occurring characters

ValueCountFrequency (%)
a 7
15.6%
5
 
11.1%
n 4
 
8.9%
, 3
 
6.7%
i 2
 
4.4%
f 2
 
4.4%
u 2
 
4.4%
d 2
 
4.4%
N 2
 
4.4%
t 2
 
4.4%
Other values (13) 14
31.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 31
68.9%
Uppercase Letter 6
 
13.3%
Space Separator 5
 
11.1%
Other Punctuation 3
 
6.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 7
22.6%
n 4
12.9%
i 2
 
6.5%
f 2
 
6.5%
u 2
 
6.5%
d 2
 
6.5%
t 2
 
6.5%
r 2
 
6.5%
e 1
 
3.2%
c 1
 
3.2%
Other values (6) 6
19.4%
Uppercase Letter
ValueCountFrequency (%)
N 2
33.3%
C 1
16.7%
A 1
16.7%
B 1
16.7%
I 1
16.7%
Space Separator
ValueCountFrequency (%)
5
100.0%
Other Punctuation
ValueCountFrequency (%)
, 3
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 37
82.2%
Common 8
 
17.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 7
18.9%
n 4
 
10.8%
i 2
 
5.4%
f 2
 
5.4%
u 2
 
5.4%
d 2
 
5.4%
N 2
 
5.4%
t 2
 
5.4%
r 2
 
5.4%
e 1
 
2.7%
Other values (11) 11
29.7%
Common
ValueCountFrequency (%)
5
62.5%
, 3
37.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 45
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 7
15.6%
5
 
11.1%
n 4
 
8.9%
, 3
 
6.7%
i 2
 
4.4%
f 2
 
4.4%
u 2
 
4.4%
d 2
 
4.4%
N 2
 
4.4%
t 2
 
4.4%
Other values (13) 14
31.1%

materialSampleID
Text

Constant  Missing 

Distinct: 1
Distinct (%): 100.0%
Missing: 3814098
Missing (%): > 99.9%
Memory size: 29.1 MiB

Length

Max length: 13
Median length: 13
Mean length: 13
Min length: 13

Characters and Unicode

Total characters: 13
Distinct characters: 12
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st row: North America

Value Count Frequency (%)
north 1 50.0%
america 1 50.0%

Most occurring characters

ValueCountFrequency (%)
r 2
15.4%
N 1
7.7%
o 1
7.7%
t 1
7.7%
h 1
7.7%
1
7.7%
A 1
7.7%
m 1
7.7%
e 1
7.7%
i 1
7.7%
Other values (2) 2
15.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 10
76.9%
Uppercase Letter 2
 
15.4%
Space Separator 1
 
7.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
r 2
20.0%
o 1
10.0%
t 1
10.0%
h 1
10.0%
m 1
10.0%
e 1
10.0%
i 1
10.0%
c 1
10.0%
a 1
10.0%
Uppercase Letter
ValueCountFrequency (%)
N 1
50.0%
A 1
50.0%
Space Separator
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 12
92.3%
Common 1
 
7.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
r 2
16.7%
N 1
8.3%
o 1
8.3%
t 1
8.3%
h 1
8.3%
A 1
8.3%
m 1
8.3%
e 1
8.3%
i 1
8.3%
c 1
8.3%
Common
ValueCountFrequency (%)
1
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 13
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
r 2
15.4%
N 1
7.7%
o 1
7.7%
t 1
7.7%
h 1
7.7%
1
7.7%
A 1
7.7%
m 1
7.7%
e 1
7.7%
i 1
7.7%
Other values (2) 2
15.4%

eventType
Text

Missing 

Distinct: 3
Distinct (%): 100.0%
Missing: 3814096
Missing (%): > 99.9%
Memory size: 29.1 MiB

Length

Max length: 13
Median length: 8
Mean length: 8.333333333
Min length: 4

Characters and Unicode

Total characters: 25
Distinct characters: 19
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 3
Unique (%): 100.0%

Sample

1st row: -15.6527
2nd row: Baffin Island
3rd row: 5.83

Value Count Frequency (%)
15.6527 1 25.0%
baffin 1 25.0%
island 1 25.0%
5.83 1 25.0%

Most occurring characters

ValueCountFrequency (%)
5 3
 
12.0%
f 2
 
8.0%
n 2
 
8.0%
. 2
 
8.0%
a 2
 
8.0%
8 1
 
4.0%
d 1
 
4.0%
l 1
 
4.0%
s 1
 
4.0%
I 1
 
4.0%
Other values (9) 9
36.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 10
40.0%
Decimal Number 9
36.0%
Other Punctuation 2
 
8.0%
Uppercase Letter 2
 
8.0%
Space Separator 1
 
4.0%
Dash Punctuation 1
 
4.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
5 3
33.3%
8 1
 
11.1%
1 1
 
11.1%
7 1
 
11.1%
2 1
 
11.1%
6 1
 
11.1%
3 1
 
11.1%
Lowercase Letter
ValueCountFrequency (%)
f 2
20.0%
n 2
20.0%
a 2
20.0%
d 1
10.0%
l 1
10.0%
s 1
10.0%
i 1
10.0%
Uppercase Letter
ValueCountFrequency (%)
I 1
50.0%
B 1
50.0%
Other Punctuation
ValueCountFrequency (%)
. 2
100.0%
Space Separator
ValueCountFrequency (%)
1
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 13
52.0%
Latin 12
48.0%

Most frequent character per script

Common
ValueCountFrequency (%)
5 3
23.1%
. 2
15.4%
8 1
 
7.7%
1
 
7.7%
- 1
 
7.7%
1 1
 
7.7%
7 1
 
7.7%
2 1
 
7.7%
6 1
 
7.7%
3 1
 
7.7%
Latin
ValueCountFrequency (%)
f 2
16.7%
n 2
16.7%
a 2
16.7%
d 1
8.3%
l 1
8.3%
s 1
8.3%
I 1
8.3%
i 1
8.3%
B 1
8.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 25
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5 3
 
12.0%
f 2
 
8.0%
n 2
 
8.0%
. 2
 
8.0%
a 2
 
8.0%
8 1
 
4.0%
d 1
 
4.0%
l 1
 
4.0%
s 1
 
4.0%
I 1
 
4.0%
Other values (9) 9
36.0%
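
eventType holds only three values (-15.6527, Baffin Island, 5.83), none of which reads like an event type, and several of the fields profiled before it are populated in just one or two rows. A small sketch for listing such near-empty columns from the "Missing" counts in this report is given below. The function name `sparse_columns` and the threshold of three populated rows are arbitrary choices for illustration, and the sketch only flags columns for manual review without diagnosing why the values are there.

```python
N_ROWS = 3_814_099  # "Number of observations" from the report overview

# "Missing" counts copied from the variable summaries in this report
MISSING = {
    "pathway": 3_814_098,
    "preparations": 1_975_286,
    "disposition": 3_814_098,
    "associatedSequences": 3_809_026,
    "occurrenceRemarks": 3_306_658,
    "organismName": 3_814_097,
    "verbatimLabel": 3_814_098,
    "materialSampleID": 3_814_098,
    "eventType": 3_814_096,
}


def sparse_columns(missing: dict[str, int], n_rows: int,
                   max_populated: int = 3) -> dict[str, int]:
    """Map each near-empty column to its number of populated rows."""
    return {
        name: n_rows - count
        for name, count in missing.items()
        if n_rows - count <= max_populated
    }


flagged = sparse_columns(MISSING, N_ROWS)
for name, populated in flagged.items():
    print(f"{name}: {populated} populated row(s)")
```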

fieldNumber
Text

Missing 

Distinct: 60674
Distinct (%): 19.1%
Missing: 3496495
Missing (%): 91.7%
Memory size: 29.1 MiB

Length

Max length: 97
Median length: 64
Mean length: 12.75618065
Min length: 1

Characters and Unicode

Total characters: 4051414
Distinct characters: 84
Distinct categories: 12
Distinct scripts: 2
Distinct blocks: 2

Unique

Unique: 27322
Unique (%): 8.6%

Sample

1st row: MMS-MAMES/B3:M4-4
2nd row: USARP/EL/9/740/USC
3rd row: M165503; H.29-118
4th row: USFC/A5151
5th row: USARP/EL/6/369/USC

Value Count Frequency (%)
vgs 7929 1.9%
mms-mafla/jar 7004 1.7%
jtw 5892 1.4%
bolland/rfb 3098 0.7%
bbc 2577 0.6%
(blank) 2230 0.5%
humes 2193 0.5%
jpem 2085 0.5%
lwk 1727 0.4%
lk 1719 0.4%
Other values (57021) 377011 91.2%

Most occurring characters

ValueCountFrequency (%)
/ 305500
 
7.5%
S 290933
 
7.2%
- 276981
 
6.8%
1 218982
 
5.4%
M 216754
 
5.4%
0 202603
 
5.0%
A 192694
 
4.8%
2 191794
 
4.7%
C 163005
 
4.0%
3 134147
 
3.3%
Other values (74) 1858021
45.9%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 1835614
45.3%
Decimal Number 1403440
34.6%
Other Punctuation 360416
 
8.9%
Dash Punctuation 276981
 
6.8%
Space Separator 95861
 
2.4%
Lowercase Letter 73645
 
1.8%
Connector Punctuation 3087
 
0.1%
Close Punctuation 1107
 
< 0.1%
Open Punctuation 1106
 
< 0.1%
Math Symbol 154
 
< 0.1%
Other values (2) 3
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
S 290933
15.8%
M 216754
11.8%
A 192694
10.5%
C 163005
 
8.9%
U 116671
 
6.4%
F 102454
 
5.6%
L 84366
 
4.6%
I 83059
 
4.5%
R 80833
 
4.4%
B 78620
 
4.3%
Other values (16) 426225
23.2%
Lowercase Letter
ValueCountFrequency (%)
e 11928
16.2%
r 11200
15.2%
a 10936
14.8%
o 5737
7.8%
l 4611
 
6.3%
i 4055
 
5.5%
u 3859
 
5.2%
s 3658
 
5.0%
t 3451
 
4.7%
m 3145
 
4.3%
Other values (16) 11065
15.0%
Other Punctuation
ValueCountFrequency (%)
/ 305500
84.8%
: 33445
 
9.3%
; 15106
 
4.2%
. 3639
 
1.0%
, 1521
 
0.4%
# 907
 
0.3%
\ 150
 
< 0.1%
? 57
 
< 0.1%
& 48
 
< 0.1%
' 24
 
< 0.1%
Other values (3) 19
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
1 218982
15.6%
0 202603
14.4%
2 191794
13.7%
3 134147
9.6%
5 132922
9.5%
4 116940
8.3%
7 111223
7.9%
6 108012
7.7%
8 96459
6.9%
9 90358
6.4%
Math Symbol
ValueCountFrequency (%)
+ 148
96.1%
= 6
 
3.9%
Dash Punctuation
ValueCountFrequency (%)
- 276981
100.0%
Space Separator
ValueCountFrequency (%)
95861
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 3087
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1107
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1106
100.0%
Final Punctuation
ValueCountFrequency (%)
2
100.0%
Control
ValueCountFrequency (%)
 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 2142155
52.9%
Latin 1909259
47.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
S 290933
15.2%
M 216754
11.4%
A 192694
 
10.1%
C 163005
 
8.5%
U 116671
 
6.1%
F 102454
 
5.4%
L 84366
 
4.4%
I 83059
 
4.4%
R 80833
 
4.2%
B 78620
 
4.1%
Other values (42) 499870
26.2%
Common
ValueCountFrequency (%)
/ 305500
14.3%
- 276981
12.9%
1 218982
10.2%
0 202603
9.5%
2 191794
9.0%
3 134147
 
6.3%
5 132922
 
6.2%
4 116940
 
5.5%
7 111223
 
5.2%
6 108012
 
5.0%
Other values (22) 343051
16.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4051412
> 99.9%
Punctuation 2
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
/ 305500
 
7.5%
S 290933
 
7.2%
- 276981
 
6.8%
1 218982
 
5.4%
M 216754
 
5.4%
0 202603
 
5.0%
A 192694
 
4.8%
2 191794
 
4.7%
C 163005
 
4.0%
3 134147
 
3.3%
Other values (73) 1858019
45.9%
Punctuation
ValueCountFrequency (%)
2
100.0%

eventDate
Text

Missing 

Distinct: 94419
Distinct (%): 3.0%
Missing: 653351
Missing (%): 17.1%
Memory size: 29.1 MiB

Length

Max length: 24
Median length: 10
Mean length: 10.11858142
Min length: 4

Characters and Unicode

Total characters: 31982286
Distinct characters: 24
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 21671
Unique (%): 0.7%

Sample

1st row: 1981-04-24
2nd row: 1952-03-30
3rd row: 1958-08-06
4th row: 1900-11
5th row: 1988-08-20

Value Count Frequency (%)
or 3309 0.1%
1838/1842 3220 0.1%
1915 3013 0.1%
1913 2523 0.1%
1982-07-21 2373 0.1%
1891 2257 0.1%
1981-07-06 2204 0.1%
1983-05-13 2158 0.1%
1982-11-19 2081 0.1%
1916 2046 0.1%
Other values (92244) 3142184 99.2%

Most occurring characters

ValueCountFrequency (%)
- 6096876
19.1%
1 6057096
18.9%
0 4860793
15.2%
9 4024283
12.6%
2 2319326
 
7.3%
8 1867324
 
5.8%
7 1478663
 
4.6%
6 1473117
 
4.6%
3 1249910
 
3.9%
5 1205209
 
3.8%
Other values (14) 1349689
 
4.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 25667726
80.3%
Dash Punctuation 6096876
 
19.1%
Other Punctuation 204437
 
0.6%
Space Separator 6620
 
< 0.1%
Lowercase Letter 6618
 
< 0.1%
Uppercase Letter 7
 
< 0.1%
Open Punctuation 1
 
< 0.1%
Close Punctuation 1
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 6057096
23.6%
0 4860793
18.9%
9 4024283
15.7%
2 2319326
 
9.0%
8 1867324
 
7.3%
7 1478663
 
5.8%
6 1473117
 
5.7%
3 1249910
 
4.9%
5 1205209
 
4.7%
4 1132005
 
4.4%
Uppercase Letter
ValueCountFrequency (%)
G 2
28.6%
S 2
28.6%
W 1
14.3%
E 1
14.3%
P 1
14.3%
Other Punctuation
ValueCountFrequency (%)
/ 201012
98.3%
, 3424
 
1.7%
: 1
 
< 0.1%
Lowercase Letter
ValueCountFrequency (%)
o 3309
50.0%
r 3309
50.0%
Dash Punctuation
ValueCountFrequency (%)
- 6096876
100.0%
Space Separator
ValueCountFrequency (%)
6620
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 31975661
> 99.9%
Latin 6625
 
< 0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
- 6096876
19.1%
1 6057096
18.9%
0 4860793
15.2%
9 4024283
12.6%
2 2319326
 
7.3%
8 1867324
 
5.8%
7 1478663
 
4.6%
6 1473117
 
4.6%
3 1249910
 
3.9%
5 1205209
 
3.8%
Other values (7) 1343064
 
4.2%
Latin
ValueCountFrequency (%)
o 3309
49.9%
r 3309
49.9%
G 2
 
< 0.1%
S 2
 
< 0.1%
W 1
 
< 0.1%
E 1
 
< 0.1%
P 1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 31982286
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
- 6096876
19.1%
1 6057096
18.9%
0 4860793
15.2%
9 4024283
12.6%
2 2319326
 
7.3%
8 1867324
 
5.8%
7 1478663
 
4.6%
6 1473117
 
4.6%
3 1249910
 
3.9%
5 1205209
 
3.8%
Other values (14) 1349689
 
4.2%

eventTime
Text

Constant  Missing 

Distinct: 1
Distinct (%): 100.0%
Missing: 3814098
Missing (%): > 99.9%
Memory size: 29.1 MiB

Length

Max length: 7
Median length: 7
Mean length: 7
Min length: 7

Characters and Unicode

Total characters: 7
Distinct characters: 6
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st row: Nunavut

Value     Count   Frequency (%)
nunavut   1       100.0%

Most occurring characters

ValueCountFrequency (%)
u 2
28.6%
N 1
14.3%
n 1
14.3%
a 1
14.3%
v 1
14.3%
t 1
14.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 6
85.7%
Uppercase Letter 1
 
14.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
u 2
33.3%
n 1
16.7%
a 1
16.7%
v 1
16.7%
t 1
16.7%
Uppercase Letter
ValueCountFrequency (%)
N 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 7
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
u 2
28.6%
N 1
14.3%
n 1
14.3%
a 1
14.3%
v 1
14.3%
t 1
14.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 7
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
u 2
28.6%
N 1
14.3%
n 1
14.3%
a 1
14.3%
v 1
14.3%
t 1
14.3%

startDayOfYear
Text

Missing 

Distinct: 366
Distinct (%): < 0.1%
Missing: 806907
Missing (%): 21.2%
Memory size: 29.1 MiB

Length

Max length: 3
Median length: 3
Mean length: 2.772580866
Min length: 1

Characters and Unicode

Total characters: 8337683
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 114
2nd row: 90
3rd row: 218
4th row: 334
5th row: 233

Value                Count     Frequency (%)
212                  37593     1.3%
243                  33383     1.1%
181                  32170     1.1%
151                  30939     1.0%
120                  24240     0.8%
213                  22295     0.7%
273                  20900     0.7%
90                   20298     0.7%
334                  19034     0.6%
304                  18795     0.6%
Other values (356)   2747545   91.4%
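
In Darwin Core, startDayOfYear is the ordinal day of the year on which the event began. A minimal sketch, assuming a full ISO 8601 date is available (the helper name day_of_year is illustrative), shows how such a value follows from a date. The eventDate samples 1981-04-24, 1952-03-30, 1958-08-06 and 1988-08-20 listed earlier in this report give 114, 90, 218 and 233, which agree with sample rows shown for this variable.

```python
from datetime import date


def day_of_year(iso_date):
    """Return the ordinal day of the year (1-366) for a YYYY-MM-DD string."""
    return date.fromisoformat(iso_date).timetuple().tm_yday


for iso in ["1981-04-24", "1952-03-30", "1958-08-06", "1988-08-20"]:
    print(iso, "->", day_of_year(iso))
```

Partial dates such as 1900-11 have no single ordinal day, so they would need a convention (first day, last day, or a range) before conversion.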

Most occurring characters

ValueCountFrequency (%)
1 1695498
20.3%
2 1611644
19.3%
3 1027787
12.3%
4 641030
 
7.7%
5 618209
 
7.4%
0 582501
 
7.0%
6 556510
 
6.7%
9 547129
 
6.6%
8 530041
 
6.4%
7 527334
 
6.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 8337683
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 1695498
20.3%
2 1611644
19.3%
3 1027787
12.3%
4 641030
 
7.7%
5 618209
 
7.4%
0 582501
 
7.0%
6 556510
 
6.7%
9 547129
 
6.6%
8 530041
 
6.4%
7 527334
 
6.3%

Most occurring scripts

ValueCountFrequency (%)
Common 8337683
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 1695498
20.3%
2 1611644
19.3%
3 1027787
12.3%
4 641030
 
7.7%
5 618209
 
7.4%
0 582501
 
7.0%
6 556510
 
6.7%
9 547129
 
6.6%
8 530041
 
6.4%
7 527334
 
6.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 8337683
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 1695498
20.3%
2 1611644
19.3%
3 1027787
12.3%
4 641030
 
7.7%
5 618209
 
7.4%
0 582501
 
7.0%
6 556510
 
6.7%
9 547129
 
6.6%
8 530041
 
6.4%
7 527334
 
6.3%

endDayOfYear
Text

Missing 

Distinct: 366
Distinct (%): < 0.1%
Missing: 805827
Missing (%): 21.1%
Memory size: 29.1 MiB

Length

Max length: 3
Median length: 3
Mean length: 2.773799377
Min length: 1

Characters and Unicode

Total characters: 8344343
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 114
2nd row: 90
3rd row: 218
4th row: 334
5th row: 233

Value                Count     Frequency (%)
212                  38231     1.3%
243                  35053     1.2%
181                  32432     1.1%
151                  29225     1.0%
120                  24168     0.8%
273                  21866     0.7%
90                   21275     0.7%
213                  21123     0.7%
334                  19989     0.7%
304                  19955     0.7%
Other values (356)   2744955   91.2%

Most occurring characters

ValueCountFrequency (%)
1 1686177
20.2%
2 1613053
19.3%
3 1036459
12.4%
4 647209
 
7.8%
5 617934
 
7.4%
0 584245
 
7.0%
6 553055
 
6.6%
9 545035
 
6.5%
8 531209
 
6.4%
7 529967
 
6.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 8344343
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 1686177
20.2%
2 1613053
19.3%
3 1036459
12.4%
4 647209
 
7.8%
5 617934
 
7.4%
0 584245
 
7.0%
6 553055
 
6.6%
9 545035
 
6.5%
8 531209
 
6.4%
7 529967
 
6.4%

Most occurring scripts

ValueCountFrequency (%)
Common 8344343
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 1686177
20.2%
2 1613053
19.3%
3 1036459
12.4%
4 647209
 
7.8%
5 617934
 
7.4%
0 584245
 
7.0%
6 553055
 
6.6%
9 545035
 
6.5%
8 531209
 
6.4%
7 529967
 
6.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 8344343
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 1686177
20.2%
2 1613053
19.3%
3 1036459
12.4%
4 647209
 
7.8%
5 617934
 
7.4%
0 584245
 
7.0%
6 553055
 
6.6%
9 545035
 
6.5%
8 531209
 
6.4%
7 529967
 
6.4%

year
Text

Missing 

Distinct: 322
Distinct (%): < 0.1%
Missing: 653351
Missing (%): 17.1%
Memory size: 29.1 MiB

Length

Max length: 69
Median length: 4
Mean length: 4.000020565
Min length: 4

Characters and Unicode

Total characters: 12643057
Distinct characters: 36
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 48
Unique (%): < 0.1%

Sample

1st row: 1981
2nd row: 1952
3rd row: 1958
4th row: 1900
5th row: 1988

Value                Count     Frequency (%)
1966                 58752     1.9%
1967                 54040     1.7%
1964                 53626     1.7%
1977                 51065     1.6%
1968                 50503     1.6%
1965                 47521     1.5%
1969                 45348     1.4%
1963                 41039     1.3%
1970                 40287     1.3%
1971                 40050     1.3%
Other values (319)   2678524   84.7%

Most occurring characters

ValueCountFrequency (%)
1 3560563
28.2%
9 3272320
25.9%
8 1124797
 
8.9%
0 820664
 
6.5%
6 800837
 
6.3%
7 730947
 
5.8%
2 679649
 
5.4%
5 556593
 
4.4%
4 552089
 
4.4%
3 544529
 
4.3%
Other values (26) 69
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 12642988
> 99.9%
Lowercase Letter 52
 
< 0.1%
Space Separator 7
 
< 0.1%
Uppercase Letter 7
 
< 0.1%
Other Punctuation 1
 
< 0.1%
Open Punctuation 1
 
< 0.1%
Close Punctuation 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
r 11
21.2%
i 6
11.5%
t 5
9.6%
e 4
 
7.7%
o 4
 
7.7%
n 4
 
7.7%
a 3
 
5.8%
s 3
 
5.8%
b 2
 
3.8%
l 2
 
3.8%
Other values (7) 8
15.4%
Decimal Number
ValueCountFrequency (%)
1 3560563
28.2%
9 3272320
25.9%
8 1124797
 
8.9%
0 820664
 
6.5%
6 800837
 
6.3%
7 730947
 
5.8%
2 679649
 
5.4%
5 556593
 
4.4%
4 552089
 
4.4%
3 544529
 
4.3%
Uppercase Letter
ValueCountFrequency (%)
F 2
28.6%
D 2
28.6%
T 1
14.3%
N 1
14.3%
H 1
14.3%
Space Separator
ValueCountFrequency (%)
7
100.0%
Other Punctuation
ValueCountFrequency (%)
, 1
100.0%
Open Punctuation
ValueCountFrequency (%)
[ 1
100.0%
Close Punctuation
ValueCountFrequency (%)
] 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 12642998
> 99.9%
Latin 59
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
r 11
18.6%
i 6
 
10.2%
t 5
 
8.5%
e 4
 
6.8%
o 4
 
6.8%
n 4
 
6.8%
a 3
 
5.1%
s 3
 
5.1%
F 2
 
3.4%
b 2
 
3.4%
Other values (12) 15
25.4%
Common
ValueCountFrequency (%)
1 3560563
28.2%
9 3272320
25.9%
8 1124797
 
8.9%
0 820664
 
6.5%
6 800837
 
6.3%
7 730947
 
5.8%
2 679649
 
5.4%
5 556593
 
4.4%
4 552089
 
4.4%
3 544529
 
4.3%
Other values (4) 10
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 12643057
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 3560563
28.2%
9 3272320
25.9%
8 1124797
 
8.9%
0 820664
 
6.5%
6 800837
 
6.3%
7 730947
 
5.8%
2 679649
 
5.4%
5 556593
 
4.4%
4 552089
 
4.4%
3 544529
 
4.3%
Other values (26) 69
 
< 0.1%

month
Text

Missing 

Distinct: 12
Distinct (%): < 0.1%
Missing: 799915
Missing (%): 21.0%
Memory size: 29.1 MiB

Length

Max length: 2
Median length: 1
Mean length: 1.174196068
Min length: 1

Characters and Unicode

Total characters: 3539243
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 4
2nd row: 3
3rd row: 8
4th row: 11
5th row: 8

Value              Count    Frequency (%)
7                  391749   13.0%
8                  361170   12.0%
6                  326519   10.8%
5                  309261   10.3%
4                  251357   8.3%
9                  250756   8.3%
3                  228476   7.6%
10                 201213   6.7%
2                  197532   6.6%
11                 182252   6.0%
Other values (2)   313899   10.4%

Most occurring characters

ValueCountFrequency (%)
1 879616
24.9%
7 391749
11.1%
8 361170
10.2%
2 339126
 
9.6%
6 326519
 
9.2%
5 309261
 
8.7%
4 251357
 
7.1%
9 250756
 
7.1%
3 228476
 
6.5%
0 201213
 
5.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 3539243
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 879616
24.9%
7 391749
11.1%
8 361170
10.2%
2 339126
 
9.6%
6 326519
 
9.2%
5 309261
 
8.7%
4 251357
 
7.1%
9 250756
 
7.1%
3 228476
 
6.5%
0 201213
 
5.7%

Most occurring scripts

ValueCountFrequency (%)
Common 3539243
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 879616
24.9%
7 391749
11.1%
8 361170
10.2%
2 339126
 
9.6%
6 326519
 
9.2%
5 309261
 
8.7%
4 251357
 
7.1%
9 250756
 
7.1%
3 228476
 
6.5%
0 201213
 
5.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3539243
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 879616
24.9%
7 391749
11.1%
8 361170
10.2%
2 339126
 
9.6%
6 326519
 
9.2%
5 309261
 
8.7%
4 251357
 
7.1%
9 250756
 
7.1%
3 228476
 
6.5%
0 201213
 
5.7%

day
Text

Missing 

Distinct: 31
Distinct (%): < 0.1%
Missing: 1074234
Missing (%): 28.2%
Memory size: 29.1 MiB

Length

Max length: 2
Median length: 2
Mean length: 1.706628246
Min length: 1

Characters and Unicode

Total characters: 4675931
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 24
2nd row: 30
3rd row: 6
4th row: 20
5th row: 8

Value               Count     Frequency (%)
15                  98711     3.6%
10                  98499     3.6%
20                  97124     3.5%
1                   94601     3.5%
19                  94164     3.4%
8                   93400     3.4%
13                  92580     3.4%
18                  92566     3.4%
21                  92198     3.4%
25                  91203     3.3%
Other values (21)   1794819   65.5%

Most occurring characters

ValueCountFrequency (%)
1 1245266
26.6%
2 1151590
24.6%
3 393455
 
8.4%
5 279261
 
6.0%
0 274680
 
5.9%
8 274022
 
5.9%
6 267711
 
5.7%
7 265434
 
5.7%
4 264266
 
5.7%
9 260246
 
5.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 4675931
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 1245266
26.6%
2 1151590
24.6%
3 393455
 
8.4%
5 279261
 
6.0%
0 274680
 
5.9%
8 274022
 
5.9%
6 267711
 
5.7%
7 265434
 
5.7%
4 264266
 
5.7%
9 260246
 
5.6%

Most occurring scripts

ValueCountFrequency (%)
Common 4675931
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 1245266
26.6%
2 1151590
24.6%
3 393455
 
8.4%
5 279261
 
6.0%
0 274680
 
5.9%
8 274022
 
5.9%
6 267711
 
5.7%
7 265434
 
5.7%
4 264266
 
5.7%
9 260246
 
5.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4675931
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 1245266
26.6%
2 1151590
24.6%
3 393455
 
8.4%
5 279261
 
6.0%
0 274680
 
5.9%
8 274022
 
5.9%
6 267711
 
5.7%
7 265434
 
5.7%
4 264266
 
5.7%
9 260246
 
5.6%

verbatimEventDate
Text

Missing 

Distinct: 221213
Distinct (%): 12.4%
Missing: 2027788
Missing (%): 53.2%
Memory size: 29.1 MiB

Length

Max length: 194
Median length: 11
Mean length: 13.22104046
Min length: 1

Characters and Unicode

Total characters: 23616890
Distinct characters: 106
Distinct categories: 14
Distinct scripts: 2
Distinct blocks: 3

Unique

Unique: 88101
Unique (%): 4.9%

Sample

1st row: 24 APR 1981
2nd row: 6 Aug 1958
3rd row: 24 Jun 1934
4th row: 24 Mar 1974
5th row: 23-29 January 1885

Value                  Count     Frequency (%)
(blank)                704830    11.7%
00                     328903    5.5%
0000                   154437    2.6%
aug                    152852    2.5%
may                    151178    2.5%
jul                    150920    2.5%
jun                    135719    2.3%
apr                    125753    2.1%
mar                    116184    1.9%
sep                    109403    1.8%
Other values (61813)   3870262   64.5%

Most occurring characters

ValueCountFrequency (%)
4214130
17.8%
1 2567975
 
10.9%
0 2139022
 
9.1%
9 1866296
 
7.9%
- 1720550
 
7.3%
2 976132
 
4.1%
8 743384
 
3.1%
6 653108
 
2.8%
7 593250
 
2.5%
3 531644
 
2.3%
Other values (96) 7611399
32.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 11001535
46.6%
Space Separator 4214130
 
17.8%
Lowercase Letter 3904026
 
16.5%
Uppercase Letter 2168227
 
9.2%
Dash Punctuation 1720560
 
7.3%
Other Punctuation 577783
 
2.4%
Open Punctuation 15135
 
0.1%
Close Punctuation 15132
 
0.1%
Connector Punctuation 190
 
< 0.1%
Math Symbol 165
 
< 0.1%
Other values (4) 7
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
u 423036
10.8%
a 412743
10.6%
r 411061
10.5%
e 384156
 
9.8%
n 288728
 
7.4%
c 225164
 
5.8%
p 222891
 
5.7%
y 215954
 
5.5%
t 190592
 
4.9%
b 168645
 
4.3%
Other values (23) 961056
24.6%
Uppercase Letter
ValueCountFrequency (%)
J 406715
18.8%
A 362558
16.7%
M 284047
13.1%
N 147056
 
6.8%
S 138874
 
6.4%
O 125338
 
5.8%
F 111776
 
5.2%
T 86104
 
4.0%
U 75652
 
3.5%
D 75433
 
3.5%
Other values (18) 354674
16.4%
Other Punctuation
ValueCountFrequency (%)
/ 290713
50.3%
: 161224
27.9%
; 52096
 
9.0%
. 44672
 
7.7%
, 24797
 
4.3%
' 2444
 
0.4%
* 966
 
0.2%
? 463
 
0.1%
! 208
 
< 0.1%
& 155
 
< 0.1%
Other values (4) 45
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
1 2567975
23.3%
0 2139022
19.4%
9 1866296
17.0%
2 976132
 
8.9%
8 743384
 
6.8%
6 653108
 
5.9%
7 593250
 
5.4%
3 531644
 
4.8%
5 479032
 
4.4%
4 451692
 
4.1%
Math Symbol
ValueCountFrequency (%)
+ 74
44.8%
| 72
43.6%
= 12
 
7.3%
~ 3
 
1.8%
< 2
 
1.2%
± 2
 
1.2%
Open Punctuation
ValueCountFrequency (%)
[ 13901
91.8%
( 1231
 
8.1%
{ 3
 
< 0.1%
Close Punctuation
ValueCountFrequency (%)
] 13898
91.8%
) 1231
 
8.1%
} 3
 
< 0.1%
Dash Punctuation
ValueCountFrequency (%)
- 1720550
> 99.9%
10
 
< 0.1%
Format
ValueCountFrequency (%)
2
66.7%
­ 1
33.3%
Space Separator
ValueCountFrequency (%)
4214130
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 190
100.0%
Other Number
ValueCountFrequency (%)
½ 2
100.0%
Other Symbol
ValueCountFrequency (%)
° 1
100.0%
Other Letter
ValueCountFrequency (%)
º 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 17544636
74.3%
Latin 6072254
 
25.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
u 423036
 
7.0%
a 412743
 
6.8%
r 411061
 
6.8%
J 406715
 
6.7%
e 384156
 
6.3%
A 362558
 
6.0%
n 288728
 
4.8%
M 284047
 
4.7%
c 225164
 
3.7%
p 222891
 
3.7%
Other values (52) 2651155
43.7%
Common
ValueCountFrequency (%)
4214130
24.0%
1 2567975
14.6%
0 2139022
12.2%
9 1866296
10.6%
- 1720550
9.8%
2 976132
 
5.6%
8 743384
 
4.2%
6 653108
 
3.7%
7 593250
 
3.4%
3 531644
 
3.0%
Other values (34) 1539145
 
8.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 23616832
> 99.9%
None 45
 
< 0.1%
Punctuation 13
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
4214130
17.8%
1 2567975
 
10.9%
0 2139022
 
9.1%
9 1866296
 
7.9%
- 1720550
 
7.3%
2 976132
 
4.1%
8 743384
 
3.1%
6 653108
 
2.8%
7 593250
 
2.5%
3 531644
 
2.3%
Other values (79) 7611341
32.2%
None
ValueCountFrequency (%)
é 16
35.6%
û 8
17.8%
ü 4
 
8.9%
ô 3
 
6.7%
± 2
 
4.4%
ä 2
 
4.4%
Æ 2
 
4.4%
½ 2
 
4.4%
° 1
 
2.2%
º 1
 
2.2%
Other values (4) 4
 
8.9%
Punctuation
ValueCountFrequency (%)
10
76.9%
2
 
15.4%
1
 
7.7%

habitat
Text

Missing 

Distinct: 106103
Distinct (%): 35.6%
Missing: 3516278
Missing (%): 92.2%
Memory size: 29.1 MiB

Length

Max length: 37931
Median length: 533
Mean length: 30.99341215
Min length: 1

Characters and Unicode

Total characters: 9230489
Distinct characters: 142
Distinct categories: 18
Distinct scripts: 2
Distinct blocks: 3

Unique

Unique: 82257
Unique (%): 27.6%

Sample

1st row: abandoned field
2nd row: In wet mixed hardwood-pine-podocarpus forest.
3rd row: Ecological remarks by collector(s): yes
4th row: Rainforest
5th row: Tropical dry forest

Value                  Count     Frequency (%)
forest                 70097     5.0%
on                     40365     2.9%
and                    34529     2.5%
in                     33893     2.4%
with                   24655     1.8%
of                     24444     1.7%
by                     24094     1.7%
remarks                20080     1.4%
ecological             20077     1.4%
collector(s            20073     1.4%
Other values (31427)   1094865   77.8%

Most occurring characters

ValueCountFrequency (%)
1107679
 
12.0%
e 823152
 
8.9%
a 709570
 
7.7%
o 679364
 
7.4%
r 608992
 
6.6%
s 585109
 
6.3%
n 520012
 
5.6%
i 459827
 
5.0%
t 445145
 
4.8%
l 408741
 
4.4%
Other values (132) 2882898
31.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 7389357
80.1%
Space Separator 1107679
 
12.0%
Uppercase Letter 394546
 
4.3%
Other Punctuation 229067
 
2.5%
Decimal Number 29101
 
0.3%
Close Punctuation 25311
 
0.3%
Open Punctuation 25287
 
0.3%
Dash Punctuation 18502
 
0.2%
Control 9139
 
0.1%
Math Symbol 2404
 
< 0.1%
Other values (8) 96
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 823152
11.1%
a 709570
 
9.6%
o 679364
 
9.2%
r 608992
 
8.2%
s 585109
 
7.9%
n 520012
 
7.0%
i 459827
 
6.2%
t 445145
 
6.0%
l 408741
 
5.5%
d 328756
 
4.4%
Other values (45) 1820689
24.6%
Uppercase Letter
ValueCountFrequency (%)
S 44502
 
11.3%
E 32517
 
8.2%
M 31497
 
8.0%
C 24824
 
6.3%
R 24221
 
6.1%
P 23660
 
6.0%
O 22747
 
5.8%
F 21984
 
5.6%
A 21935
 
5.6%
T 21053
 
5.3%
Other values (21) 125606
31.8%
Other Punctuation
ValueCountFrequency (%)
, 92681
40.5%
. 88295
38.5%
: 23067
 
10.1%
; 12709
 
5.5%
& 4733
 
2.1%
/ 3557
 
1.6%
" 1718
 
0.7%
' 1153
 
0.5%
? 433
 
0.2%
% 339
 
0.1%
Other values (6) 382
 
0.2%
Decimal Number
ValueCountFrequency (%)
0 7724
26.5%
1 4005
13.8%
2 3469
11.9%
3 3407
11.7%
5 3160
10.9%
4 2310
 
7.9%
6 1580
 
5.4%
8 1334
 
4.6%
9 1101
 
3.8%
7 1011
 
3.5%
Math Symbol
ValueCountFrequency (%)
+ 1170
48.7%
~ 833
34.7%
| 204
 
8.5%
= 97
 
4.0%
± 71
 
3.0%
< 16
 
0.7%
> 12
 
0.5%
× 1
 
< 0.1%
Close Punctuation
ValueCountFrequency (%)
) 24930
98.5%
] 297
 
1.2%
} 84
 
0.3%
Open Punctuation
ValueCountFrequency (%)
( 24915
98.5%
[ 288
 
1.1%
{ 84
 
0.3%
Dash Punctuation
ValueCountFrequency (%)
- 18482
99.9%
12
 
0.1%
8
 
< 0.1%
Control
ValueCountFrequency (%)
9091
99.5%
48
 
0.5%
Other Symbol
ValueCountFrequency (%)
° 51
98.1%
¦ 1
 
1.9%
Final Punctuation
ValueCountFrequency (%)
11
84.6%
2
 
15.4%
Space Separator
ValueCountFrequency (%)
1107679
100.0%
Other Letter
ValueCountFrequency (%)
º 11
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 8
100.0%
Initial Punctuation
ValueCountFrequency (%)
8
100.0%
Currency Symbol
ValueCountFrequency (%)
£ 2
100.0%
Other Number
ValueCountFrequency (%)
² 1
100.0%
Modifier Symbol
ValueCountFrequency (%)
´ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 7783914
84.3%
Common 1446575
 
15.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 823152
 
10.6%
a 709570
 
9.1%
o 679364
 
8.7%
r 608992
 
7.8%
s 585109
 
7.5%
n 520012
 
6.7%
i 459827
 
5.9%
t 445145
 
5.7%
l 408741
 
5.3%
d 328756
 
4.2%
Other values (77) 2215246
28.5%
Common
ValueCountFrequency (%)
1107679
76.6%
, 92681
 
6.4%
. 88295
 
6.1%
) 24930
 
1.7%
( 24915
 
1.7%
: 23067
 
1.6%
- 18482
 
1.3%
; 12709
 
0.9%
9091
 
0.6%
0 7724
 
0.5%
Other values (45) 37002
 
2.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 9217756
99.9%
None 12632
 
0.1%
Punctuation 101
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1107679
 
12.0%
e 823152
 
8.9%
a 709570
 
7.7%
o 679364
 
7.4%
r 608992
 
6.6%
s 585109
 
6.3%
n 520012
 
5.6%
i 459827
 
5.0%
t 445145
 
4.8%
l 408741
 
4.4%
Other values (84) 2870165
31.1%
None
ValueCountFrequency (%)
ú 1917
15.2%
ê 1816
14.4%
é 1812
14.3%
ó 1726
13.7%
í 1471
11.6%
á 1331
10.5%
ñ 1008
8.0%
è 660
 
5.2%
à 228
 
1.8%
ç 92
 
0.7%
Other values (32) 571
 
4.5%
Punctuation
ValueCountFrequency (%)
60
59.4%
12
 
11.9%
11
 
10.9%
8
 
7.9%
8
 
7.9%
2
 
2.0%

sampleSizeValue
Text

Constant  Missing 

Distinct: 1
Distinct (%): 100.0%
Missing: 3814098
Missing (%): > 99.9%
Memory size: 29.1 MiB

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Characters and Unicode

Total characters: 6
Distinct characters: 3
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st row: 1000.0

Value    Count   Frequency (%)
1000.0   1       100.0%

Most occurring characters

ValueCountFrequency (%)
0 4
66.7%
1 1
 
16.7%
. 1
 
16.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 5
83.3%
Other Punctuation 1
 
16.7%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 4
80.0%
1 1
 
20.0%
Other Punctuation
ValueCountFrequency (%)
. 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 6
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 4
66.7%
1 1
 
16.7%
. 1
 
16.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 6
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 4
66.7%
1 1
 
16.7%
. 1
 
16.7%

eventRemarks
Text

Constant  Missing 

Distinct: 1
Distinct (%): 100.0%
Missing: 3814098
Missing (%): > 99.9%
Memory size: 29.1 MiB

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 3
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st row: GPS

Value   Count   Frequency (%)
gps     1       100.0%

Most occurring characters

ValueCountFrequency (%)
G 1
33.3%
P 1
33.3%
S 1
33.3%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 3
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
G 1
33.3%
P 1
33.3%
S 1
33.3%

Most occurring scripts

ValueCountFrequency (%)
Latin 3
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
G 1
33.3%
P 1
33.3%
S 1
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
G 1
33.3%
P 1
33.3%
S 1
33.3%

locationID
Text

Missing 

Distinct: 65557
Distinct (%): 14.7%
Missing: 3366761
Missing (%): 88.3%
Memory size: 29.1 MiB

Length

Max length: 49374
Median length: 131
Mean length: 4.671313414
Min length: 1

Characters and Unicode

Total characters: 2089656
Distinct characters: 101
Distinct categories: 12
Distinct scripts: 2
Distinct blocks: 3

Unique

Unique: 36003
Unique (%): 8.0%

Sample

1st row: 31
2nd row: GS 03383
3rd row: M4
4th row: 9
5th row: 68-36

Value                  Count    Frequency (%)
d                      5711     1.1%
not                    5048     1.0%
rec                    4891     1.0%
4                      3834     0.8%
1                      3635     0.7%
rhb                    3185     0.6%
rfb                    3103     0.6%
2                      2955     0.6%
3                      2478     0.5%
6                      2386     0.5%
Other values (55856)   469594   92.7%

Most occurring characters

ValueCountFrequency (%)
1 229414
 
11.0%
2 190821
 
9.1%
0 156212
 
7.5%
- 139092
 
6.7%
5 138185
 
6.6%
3 138042
 
6.6%
4 132581
 
6.3%
6 118976
 
5.7%
7 95419
 
4.6%
8 87223
 
4.2%
Other values (91) 663691
31.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1366265
65.4%
Uppercase Letter 410919
 
19.7%
Dash Punctuation 139094
 
6.7%
Lowercase Letter 59759
 
2.9%
Space Separator 57287
 
2.7%
Other Punctuation 37590
 
1.8%
Control 11792
 
0.6%
Connector Punctuation 3314
 
0.2%
Open Punctuation 1728
 
0.1%
Close Punctuation 1547
 
0.1%
Other values (2) 361
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 7316
12.2%
t 7236
12.1%
o 6746
11.3%
e 6311
10.6%
i 5591
9.4%
n 4601
7.7%
r 4425
7.4%
l 2509
 
4.2%
c 2145
 
3.6%
u 2062
 
3.5%
Other values (26) 10817
18.1%
Uppercase Letter
ValueCountFrequency (%)
A 43233
 
10.5%
S 38365
 
9.3%
C 32592
 
7.9%
B 29728
 
7.2%
M 26503
 
6.4%
R 26076
 
6.3%
N 24656
 
6.0%
E 22030
 
5.4%
I 20351
 
5.0%
T 19501
 
4.7%
Other values (17) 127884
31.1%
Other Punctuation
ValueCountFrequency (%)
: 16051
42.7%
. 13690
36.4%
, 3696
 
9.8%
/ 2453
 
6.5%
# 688
 
1.8%
; 498
 
1.3%
& 252
 
0.7%
? 153
 
0.4%
' 51
 
0.1%
* 50
 
0.1%
Other values (4) 8
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
1 229414
16.8%
2 190821
14.0%
0 156212
11.4%
5 138185
10.1%
3 138042
10.1%
4 132581
9.7%
6 118976
8.7%
7 95419
7.0%
8 87223
 
6.4%
9 79392
 
5.8%
Math Symbol
ValueCountFrequency (%)
+ 315
87.7%
= 41
 
11.4%
| 3
 
0.8%
Dash Punctuation
ValueCountFrequency (%)
- 139092
> 99.9%
2
 
< 0.1%
Control
ValueCountFrequency (%)
11730
99.5%
62
 
0.5%
Open Punctuation
ValueCountFrequency (%)
( 1599
92.5%
[ 129
 
7.5%
Close Punctuation
ValueCountFrequency (%)
) 1418
91.7%
] 129
 
8.3%
Space Separator
ValueCountFrequency (%)
57287
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 3314
100.0%
Other Symbol
ValueCountFrequency (%)
° 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1618978
77.5%
Latin 470678
 
22.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 43233
 
9.2%
S 38365
 
8.2%
C 32592
 
6.9%
B 29728
 
6.3%
M 26503
 
5.6%
R 26076
 
5.5%
N 24656
 
5.2%
E 22030
 
4.7%
I 20351
 
4.3%
T 19501
 
4.1%
Other values (53) 187643
39.9%
Common
ValueCountFrequency (%)
1 229414
14.2%
2 190821
11.8%
0 156212
9.6%
- 139092
8.6%
5 138185
8.5%
3 138042
8.5%
4 132581
8.2%
6 118976
7.3%
7 95419
5.9%
8 87223
 
5.4%
Other values (28) 193013
11.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2089634
> 99.9%
None 20
 
< 0.1%
Punctuation 2
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 229414
 
11.0%
2 190821
 
9.1%
0 156212
 
7.5%
- 139092
 
6.7%
5 138185
 
6.6%
3 138042
 
6.6%
4 132581
 
6.3%
6 118976
 
5.7%
7 95419
 
4.6%
8 87223
 
4.2%
Other values (78) 663669
31.8%
None
ValueCountFrequency (%)
ä 3
15.0%
é 3
15.0%
á 2
10.0%
í 2
10.0%
° 2
10.0%
ü 2
10.0%
ã 1
 
5.0%
å 1
 
5.0%
ö 1
 
5.0%
è 1
 
5.0%
Other values (2) 2
10.0%
Punctuation
ValueCountFrequency (%)
2
100.0%

higherGeography
Text

Missing 

Distinct: 56561
Distinct (%): 1.5%
Missing: 118692
Missing (%): 3.1%
Memory size: 29.1 MiB

Length

Max length: 177
Median length: 138
Mean length: 40.43501785
Min length: 4

Characters and Unicode

Total characters: 149423848
Distinct characters: 186
Distinct categories: 13
Distinct scripts: 3
Distinct blocks: 6

Unique

Unique: 17764
Unique (%): 0.5%

Sample

1st row: North Atlantic Ocean, Caribbean Sea, Belize
2nd row: North America, United States, Tennessee
3rd row: North America, United States, West Virginia, Randolph
4th row: United States, Georgia, Decatur County
5th row: North Atlantic Ocean, Gulf of Mexico, United States

Value                  Count      Frequency (%)
america                1838540    9.2%
north                  1785934    8.9%
united                 1390693    6.9%
states                 1378096    6.9%
(blank)                712178     3.6%
south                  711734     3.5%
ocean                  694520     3.5%
neotropics             659180     3.3%
atlantic               362469     1.8%
pacific                345353     1.7%
Other values (18600)   10176334   50.7%

Most occurring characters

ValueCountFrequency (%)
16359624
 
10.9%
a 14443845
 
9.7%
i 10914554
 
7.3%
e 10709030
 
7.2%
t 10494122
 
7.0%
r 8233580
 
5.5%
o 8056092
 
5.4%
, 7690870
 
5.1%
n 7438148
 
5.0%
c 6015663
 
4.0%
Other values (176) 49068320
32.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 104713822
70.1%
Uppercase Letter 19392029
 
13.0%
Space Separator 16359624
 
10.9%
Other Punctuation 7803699
 
5.2%
Dash Punctuation 965382
 
0.6%
Open Punctuation 94370
 
0.1%
Close Punctuation 94348
 
0.1%
Modifier Letter 221
 
< 0.1%
Math Symbol 170
 
< 0.1%
Decimal Number 111
 
< 0.1%
Other values (3) 72
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 14443845
13.8%
i 10914554
10.4%
e 10709030
10.2%
t 10494122
10.0%
r 8233580
7.9%
o 8056092
7.7%
n 7438148
 
7.1%
c 6015663
 
5.7%
s 5269580
 
5.0%
l 3634298
 
3.5%
Other values (88) 19504910
18.6%
Uppercase Letter
ValueCountFrequency (%)
A 3309730
17.1%
N 2853613
14.7%
S 2781807
14.3%
U 1497885
7.7%
C 1419390
7.3%
P 1041591
 
5.4%
M 903434
 
4.7%
O 861595
 
4.4%
I 651469
 
3.4%
B 542484
 
2.8%
Other values (39) 3529031
18.2%
Other Punctuation
ValueCountFrequency (%)
, 7690870
98.6%
. 70709
 
0.9%
' 28187
 
0.4%
/ 10594
 
0.1%
? 2720
 
< 0.1%
; 452
 
< 0.1%
& 74
 
< 0.1%
* 46
 
< 0.1%
: 41
 
< 0.1%
¡ 3
 
< 0.1%
Other values (2) 3
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
1 29
26.1%
2 27
24.3%
3 22
19.8%
0 17
15.3%
4 7
 
6.3%
6 3
 
2.7%
8 2
 
1.8%
9 2
 
1.8%
7 2
 
1.8%
Math Symbol
ValueCountFrequency (%)
= 161
94.7%
+ 7
 
4.1%
| 1
 
0.6%
~ 1
 
0.6%
Dash Punctuation
ValueCountFrequency (%)
- 965160
> 99.9%
221
 
< 0.1%
1
 
< 0.1%
Open Punctuation
ValueCountFrequency (%)
[ 76332
80.9%
( 18038
 
19.1%
Close Punctuation
ValueCountFrequency (%)
] 76308
80.9%
) 18040
 
19.1%
Modifier Letter
ValueCountFrequency (%)
ʻ 194
87.8%
ʼ 27
 
12.2%
Modifier Symbol
ValueCountFrequency (%)
´ 52
98.1%
¸ 1
 
1.9%
Space Separator
ValueCountFrequency (%)
16359624
100.0%
Format
ValueCountFrequency (%)
17
100.0%
Nonspacing Mark
ValueCountFrequency (%)
́ 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 124105851
83.1%
Common 25317995
 
16.9%
Inherited 2
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 14443845
 
11.6%
i 10914554
 
8.8%
e 10709030
 
8.6%
t 10494122
 
8.5%
r 8233580
 
6.6%
o 8056092
 
6.5%
n 7438148
 
6.0%
c 6015663
 
4.8%
s 5269580
 
4.2%
l 3634298
 
2.9%
Other values (137) 38896939
31.3%
Common
ValueCountFrequency (%)
16359624
64.6%
, 7690870
30.4%
- 965160
 
3.8%
[ 76332
 
0.3%
] 76308
 
0.3%
. 70709
 
0.3%
' 28187
 
0.1%
) 18040
 
0.1%
( 18038
 
0.1%
/ 10594
 
< 0.1%
Other values (28) 4133
 
< 0.1%
Inherited
ValueCountFrequency (%)
́ 2
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 149209059
99.9%
None 214262
 
0.1%
Punctuation 239
 
< 0.1%
Modifier Letters 221
 
< 0.1%
Latin Ext Additional 65
 
< 0.1%
Diacriticals 2
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
16359624
 
11.0%
a 14443845
 
9.7%
i 10914554
 
7.3%
e 10709030
 
7.2%
t 10494122
 
7.0%
r 8233580
 
5.5%
o 8056092
 
5.4%
, 7690870
 
5.2%
n 7438148
 
5.0%
c 6015663
 
4.0%
Other values (72) 48853531
32.7%
None
ValueCountFrequency (%)
á 69039
32.2%
í 40375
18.8%
é 36868
17.2%
ó 26693
 
12.5%
ã 13821
 
6.5%
ô 6264
 
2.9%
ç 3624
 
1.7%
ñ 3240
 
1.5%
Î 2734
 
1.3%
ü 2671
 
1.2%
Other values (71) 8933
 
4.2%
Punctuation
ValueCountFrequency (%)
221
92.5%
17
 
7.1%
1
 
0.4%
Modifier Letters
ValueCountFrequency (%)
ʻ 194
87.8%
ʼ 27
 
12.2%
Latin Ext Additional
ValueCountFrequency (%)
18
27.7%
10
15.4%
6
 
9.2%
5
 
7.7%
5
 
7.7%
ế 4
 
6.2%
3
 
4.6%
3
 
4.6%
3
 
4.6%
1
 
1.5%
Other values (7) 7
 
10.8%
Diacriticals
ValueCountFrequency (%)
́ 2
100.0%

continent
Text

Missing 

Distinct 195
Distinct (%) < 0.1%
Missing 534327
Missing (%) 14.0%
Memory size 29.1 MiB

Length

Max length 60
Median length 57
Mean length 16.28692421
Min length 4

Characters and Unicode

Total characters 53417398
Distinct characters 43
Distinct categories 6
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
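
As a rough illustration of the idea, the sketch below counts characters per Unicode general category for a handful of values, using only Python's standard `unicodedata` module. It is not the code that produced this report: the function name `category_counts`, the `CATEGORY_NAMES` mapping, and the three sample values (taken from the continent sample rows) are assumptions made for the example. The standard library exposes general categories but not scripts or blocks, so only the "categories" breakdown is reproduced here.

```python
import unicodedata
from collections import Counter

# Long names for some general-category codes, chosen to match the labels
# that appear in this report. Codes missing from the map are kept as-is.
CATEGORY_NAMES = {
    "Ll": "Lowercase Letter",
    "Lu": "Uppercase Letter",
    "Lm": "Modifier Letter",
    "Zs": "Space Separator",
    "Po": "Other Punctuation",
    "Pd": "Dash Punctuation",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Nd": "Decimal Number",
    "Sm": "Math Symbol",
    "Mn": "Nonspacing Mark",
    "Cf": "Format",
    "Cc": "Control",
}


def category_counts(values):
    """Count characters per Unicode general category, skipping missing values."""
    counts = Counter()
    for value in values:
        if value is None:  # treat None as a missing cell
            continue
        for ch in value:
            code = unicodedata.category(ch)
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


# Illustrative input: three sample values plus one missing cell.
sample = ["North Atlantic Ocean", "North America", None, "Asia"]
counts = category_counts(sample)
for name, n in counts.most_common():
    print(name, n)
```

Dividing each count by the total number of characters would give the percentage column shown in the "Most occurring categories" tables.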

Unique

Unique 33
Unique (%) < 0.1%

Sample

1st row: North Atlantic Ocean
2nd row: North America
3rd row: North America
4th row: North Atlantic Ocean
5th row: Asia
Value Count Frequency (%)
america 1838505 23.1%
north 1713311 21.5%
ocean 692905 8.7%
(blank) 659976 8.3%
neotropics 659180 8.3%
south 633934 7.9%
atlantic 361973 4.5%
pacific 344558 4.3%
africa 139956 1.8%
asia-tropical 124686 1.6%
Other values (29) 806425 10.1%

Most occurring characters

ValueCountFrequency (%)
r 4847298
 
9.1%
4695637
 
8.8%
c 4576576
 
8.6%
i 4342913
 
8.1%
a 4262577
 
8.0%
t 4129784
 
7.7%
o 3913099
 
7.3%
e 3909963
 
7.3%
A 2706413
 
5.1%
N 2372489
 
4.4%
Other values (33) 13660649
25.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 40274803
75.4%
Uppercase Letter 7528783
 
14.1%
Space Separator 4695637
 
8.8%
Dash Punctuation 872720
 
1.6%
Other Punctuation 45452
 
0.1%
Decimal Number 3
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
r 4847298
12.0%
c 4576576
11.4%
i 4342913
10.8%
a 4262577
10.6%
t 4129784
10.3%
o 3913099
9.7%
e 3909963
9.7%
h 2348709
5.8%
m 1927373
 
4.8%
n 1468719
 
3.6%
Other values (11) 4547792
11.3%
Uppercase Letter
ValueCountFrequency (%)
A 2706413
35.9%
N 2372489
31.5%
O 706244
 
9.4%
S 635469
 
8.4%
P 344622
 
4.6%
T 213478
 
2.8%
I 207748
 
2.8%
C 112108
 
1.5%
W 109235
 
1.5%
E 107549
 
1.4%
Other values (3) 13428
 
0.2%
Other Punctuation
ValueCountFrequency (%)
, 44654
98.2%
/ 550
 
1.2%
? 247
 
0.5%
. 1
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
6 1
33.3%
3 1
33.3%
0 1
33.3%
Space Separator
ValueCountFrequency (%)
4695637
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 872720
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 47803586
89.5%
Common 5613812
 
10.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
r 4847298
10.1%
c 4576576
9.6%
i 4342913
9.1%
a 4262577
8.9%
t 4129784
 
8.6%
o 3913099
 
8.2%
e 3909963
 
8.2%
A 2706413
 
5.7%
N 2372489
 
5.0%
h 2348709
 
4.9%
Other values (24) 10393765
21.7%
Common
ValueCountFrequency (%)
4695637
83.6%
- 872720
 
15.5%
, 44654
 
0.8%
/ 550
 
< 0.1%
? 247
 
< 0.1%
6 1
 
< 0.1%
3 1
 
< 0.1%
. 1
 
< 0.1%
0 1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 53417398
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
r 4847298
 
9.1%
4695637
 
8.8%
c 4576576
 
8.6%
i 4342913
 
8.1%
a 4262577
 
8.0%
t 4129784
 
7.7%
o 3913099
 
7.3%
e 3909963
 
7.3%
A 2706413
 
5.1%
N 2372489
 
4.4%
Other values (33) 13660649
25.6%

waterBody
Text

Missing 

Distinct 2959
Distinct (%) 0.4%
Missing 3107446
Missing (%) 81.5%
Memory size 29.1 MiB

Length

Max length 75
Median length 73
Mean length 24.15769409
Min length 4

Characters and Unicode

Total characters 17071107
Distinct characters 74
Distinct categories 8
Distinct scripts 2
Distinct blocks 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 1175
Unique (%) 0.2%

Sample

1st row: North Atlantic Ocean, Caribbean Sea
2nd row: North Atlantic Ocean, Gulf of Mexico
3rd row: North Atlantic Ocean, Gulf of Mexico, Galveston Bay
4th row: North Pacific Ocean, Gulf of California
5th row: North Atlantic Ocean, Gulf of Guinea
Value Count Frequency (%)
ocean 692937 26.0%
north 526630 19.8%
atlantic 362005 13.6%
pacific 281664 10.6%
of 114420 4.3%
sea 113745 4.3%
gulf 112638 4.2%
south 98866 3.7%
mexico 87826 3.3%
caribbean 51340 1.9%
Other values (2054) 223154 8.4%

Most occurring characters

ValueCountFrequency (%)
1958572
11.5%
a 1779883
10.4%
c 1769422
10.4%
t 1425954
 
8.4%
n 1282457
 
7.5%
i 1197252
 
7.0%
e 1020751
 
6.0%
o 873786
 
5.1%
O 697406
 
4.1%
r 658055
 
3.9%
Other values (64) 4407569
25.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 12299427
72.0%
Uppercase Letter 2553914
 
15.0%
Space Separator 1958572
 
11.5%
Other Punctuation 258083
 
1.5%
Dash Punctuation 849
 
< 0.1%
Modifier Letter 186
 
< 0.1%
Open Punctuation 38
 
< 0.1%
Close Punctuation 38
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 1779883
14.5%
c 1769422
14.4%
t 1425954
11.6%
n 1282457
10.4%
i 1197252
9.7%
e 1020751
8.3%
o 873786
7.1%
r 658055
 
5.4%
h 646399
 
5.3%
f 515487
 
4.2%
Other values (23) 1129981
9.2%
Uppercase Letter
ValueCountFrequency (%)
O 697406
27.3%
N 528030
20.7%
A 393486
15.4%
P 291650
11.4%
S 234718
 
9.2%
G 115417
 
4.5%
M 104639
 
4.1%
C 74920
 
2.9%
B 42905
 
1.7%
I 37263
 
1.5%
Other values (16) 33480
 
1.3%
Other Punctuation
ValueCountFrequency (%)
, 256772
99.5%
; 447
 
0.2%
' 340
 
0.1%
. 265
 
0.1%
/ 195
 
0.1%
? 36
 
< 0.1%
: 27
 
< 0.1%
* 1
 
< 0.1%
Open Punctuation
ValueCountFrequency (%)
( 37
97.4%
[ 1
 
2.6%
Close Punctuation
ValueCountFrequency (%)
) 37
97.4%
] 1
 
2.6%
Space Separator
ValueCountFrequency (%)
1958572
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 849
100.0%
Modifier Letter
ValueCountFrequency (%)
ʻ 186
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 14853341
87.0%
Common 2217766
 
13.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 1779883
12.0%
c 1769422
11.9%
t 1425954
9.6%
n 1282457
 
8.6%
i 1197252
 
8.1%
e 1020751
 
6.9%
o 873786
 
5.9%
O 697406
 
4.7%
r 658055
 
4.4%
h 646399
 
4.4%
Other values (49) 3501976
23.6%
Common
ValueCountFrequency (%)
1958572
88.3%
, 256772
 
11.6%
- 849
 
< 0.1%
; 447
 
< 0.1%
' 340
 
< 0.1%
. 265
 
< 0.1%
/ 195
 
< 0.1%
ʻ 186
 
< 0.1%
( 37
 
< 0.1%
) 37
 
< 0.1%
Other values (5) 66
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 17070508
> 99.9%
None 413
 
< 0.1%
Modifier Letters 186
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1958572
11.5%
a 1779883
10.4%
c 1769422
10.4%
t 1425954
 
8.4%
n 1282457
 
7.5%
i 1197252
 
7.0%
e 1020751
 
6.0%
o 873786
 
5.1%
O 697406
 
4.1%
r 658055
 
3.9%
Other values (55) 4406970
25.8%
Modifier Letters
ValueCountFrequency (%)
ʻ 186
100.0%
None
ValueCountFrequency (%)
ā 186
45.0%
í 87
21.1%
á 62
 
15.0%
ñ 34
 
8.2%
é 21
 
5.1%
ó 15
 
3.6%
è 6
 
1.5%
É 2
 
0.5%

islandGroup
Text

Missing 

Distinct 711
Distinct (%) 0.8%
Missing 3729526
Missing (%) 97.8%
Memory size 29.1 MiB

Length

Max length 45
Median length 41
Mean length 14.65497263
Min length 4

Characters and Unicode

Total characters 1239415
Distinct characters 69
Distinct categories 9
Distinct scripts 2
Distinct blocks 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 142
Unique (%) 0.2%

Sample

1st row: Pelican Cays
2nd row: Greater Antilles
3rd row: Stewart Islands
4th row: Ralik Chain
5th row: Virgin Islands
Value Count Frequency (%)
islands 29499 16.0%
antilles 14047 7.6%
greater 13778 7.5%
group 12534 6.8%
is 8170 4.4%
leeward 4503 2.4%
new 3902 2.1%
hispaniola 3745 2.0%
chain 3337 1.8%
virgin 2783 1.5%
Other values (590) 87985 47.7%

Most occurring characters

ValueCountFrequency (%)
a 144757
 
11.7%
s 111752
 
9.0%
99710
 
8.0%
n 91421
 
7.4%
l 89010
 
7.2%
e 86875
 
7.0%
r 74416
 
6.0%
i 63025
 
5.1%
d 51906
 
4.2%
t 45327
 
3.7%
Other values (59) 381216
30.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 944900
76.2%
Uppercase Letter 181881
 
14.7%
Space Separator 99710
 
8.0%
Other Punctuation 8968
 
0.7%
Open Punctuation 1958
 
0.2%
Close Punctuation 1958
 
0.2%
Dash Punctuation 18
 
< 0.1%
Format 17
 
< 0.1%
Math Symbol 5
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 144757
15.3%
s 111752
11.8%
n 91421
9.7%
l 89010
9.4%
e 86875
9.2%
r 74416
7.9%
i 63025
6.7%
d 51906
 
5.5%
t 45327
 
4.8%
o 41461
 
4.4%
Other values (20) 144950
15.3%
Uppercase Letter
ValueCountFrequency (%)
I 40709
22.4%
G 32103
17.7%
A 17580
9.7%
C 13790
 
7.6%
V 9621
 
5.3%
L 9140
 
5.0%
S 8922
 
4.9%
B 6699
 
3.7%
N 5730
 
3.2%
H 5523
 
3.0%
Other values (17) 32064
17.6%
Other Punctuation
ValueCountFrequency (%)
. 8168
91.1%
' 784
 
8.7%
, 13
 
0.1%
? 3
 
< 0.1%
Open Punctuation
ValueCountFrequency (%)
( 1133
57.9%
[ 825
42.1%
Close Punctuation
ValueCountFrequency (%)
) 1133
57.9%
] 825
42.1%
Space Separator
ValueCountFrequency (%)
99710
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 18
100.0%
Format
ValueCountFrequency (%)
17
100.0%
Math Symbol
ValueCountFrequency (%)
= 5
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1126781
90.9%
Common 112634
 
9.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 144757
12.8%
s 111752
 
9.9%
n 91421
 
8.1%
l 89010
 
7.9%
e 86875
 
7.7%
r 74416
 
6.6%
i 63025
 
5.6%
d 51906
 
4.6%
t 45327
 
4.0%
o 41461
 
3.7%
Other values (47) 326831
29.0%
Common
ValueCountFrequency (%)
99710
88.5%
. 8168
 
7.3%
( 1133
 
1.0%
) 1133
 
1.0%
[ 825
 
0.7%
] 825
 
0.7%
' 784
 
0.7%
- 18
 
< 0.1%
17
 
< 0.1%
, 13
 
< 0.1%
Other values (2) 8
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1235671
99.7%
None 3727
 
0.3%
Punctuation 17
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 144757
 
11.7%
s 111752
 
9.0%
99710
 
8.1%
n 91421
 
7.4%
l 89010
 
7.2%
e 86875
 
7.0%
r 74416
 
6.0%
i 63025
 
5.1%
d 51906
 
4.2%
t 45327
 
3.7%
Other values (52) 377472
30.5%
None
ValueCountFrequency (%)
Î 1933
51.9%
á 1755
47.1%
Ō 30
 
0.8%
ñ 7
 
0.2%
ù 1
 
< 0.1%
à 1
 
< 0.1%
Punctuation
ValueCountFrequency (%)
17
100.0%

island
Text

Missing 

Distinct 4691
Distinct (%) 1.8%
Missing 3560499
Missing (%) 93.4%
Memory size 29.1 MiB

Length

Max length 47
Median length 41
Mean length 9.538844637
Min length 2

Characters and Unicode

Total characters 2419051
Distinct characters 87
Distinct categories 9
Distinct scripts 2
Distinct blocks 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 1368
Unique (%) 0.5%

Sample

1st row: Honshu
2nd row: Lana'i
3rd row: Cat Cay
4th row: Hawaii
5th row: Sumatra
Value Count Frequency (%)
island 42237 10.8%
hispaniola 20799 5.3%
cuba 10640 2.7%
oahu 9896 2.5%
atoll 8952 2.3%
luzon 8577 2.2%
new 7682 2.0%
bermuda 6749 1.7%
guinea 6114 1.6%
st 6066 1.6%
Other values (3576) 261632 67.2%

Most occurring characters

ValueCountFrequency (%)
a 373003
15.4%
n 172836
 
7.1%
i 161013
 
6.7%
o 150399
 
6.2%
135744
 
5.6%
l 133792
 
5.5%
e 121984
 
5.0%
u 119529
 
4.9%
s 109905
 
4.5%
r 96589
 
4.0%
Other values (77) 844257
34.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1864763
77.1%
Uppercase Letter 382282
 
15.8%
Space Separator 135744
 
5.6%
Other Punctuation 18988
 
0.8%
Close Punctuation 8062
 
0.3%
Open Punctuation 8058
 
0.3%
Dash Punctuation 1145
 
< 0.1%
Decimal Number 8
 
< 0.1%
Modifier Letter 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 373003
20.0%
n 172836
9.3%
i 161013
8.6%
o 150399
8.1%
l 133792
 
7.2%
e 121984
 
6.5%
u 119529
 
6.4%
s 109905
 
5.9%
r 96589
 
5.2%
d 86087
 
4.6%
Other values (33) 339626
18.2%
Uppercase Letter
ValueCountFrequency (%)
I 54105
14.2%
C 36416
 
9.5%
H 35901
 
9.4%
S 29181
 
7.6%
B 28985
 
7.6%
M 24615
 
6.4%
T 17913
 
4.7%
A 17328
 
4.5%
G 17269
 
4.5%
L 17032
 
4.5%
Other values (18) 103537
27.1%
Other Punctuation
ValueCountFrequency (%)
. 9244
48.7%
' 9091
47.9%
, 582
 
3.1%
? 56
 
0.3%
/ 15
 
0.1%
Decimal Number
ValueCountFrequency (%)
0 3
37.5%
3 3
37.5%
2 1
 
12.5%
6 1
 
12.5%
Open Punctuation
ValueCountFrequency (%)
[ 5868
72.8%
( 2190
 
27.2%
Close Punctuation
ValueCountFrequency (%)
] 5867
72.8%
) 2195
 
27.2%
Space Separator
ValueCountFrequency (%)
135744
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1145
100.0%
Modifier Letter
ValueCountFrequency (%)
ʻ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2247045
92.9%
Common 172006
 
7.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 373003
16.6%
n 172836
 
7.7%
i 161013
 
7.2%
o 150399
 
6.7%
l 133792
 
6.0%
e 121984
 
5.4%
u 119529
 
5.3%
s 109905
 
4.9%
r 96589
 
4.3%
d 86087
 
3.8%
Other values (61) 721908
32.1%
Common
ValueCountFrequency (%)
135744
78.9%
. 9244
 
5.4%
' 9091
 
5.3%
[ 5868
 
3.4%
] 5867
 
3.4%
) 2195
 
1.3%
( 2190
 
1.3%
- 1145
 
0.7%
, 582
 
0.3%
? 56
 
< 0.1%
Other values (6) 24
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2416110
99.9%
None 2934
 
0.1%
Latin Ext Additional 6
 
< 0.1%
Modifier Letters 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 373003
15.4%
n 172836
 
7.2%
i 161013
 
6.7%
o 150399
 
6.2%
135744
 
5.6%
l 133792
 
5.5%
e 121984
 
5.0%
u 119529
 
4.9%
s 109905
 
4.5%
r 96589
 
4.0%
Other values (56) 841316
34.8%
None
ValueCountFrequency (%)
ç 739
25.2%
Î 657
22.4%
é 407
13.9%
ó 393
13.4%
á 298
10.2%
â 151
 
5.1%
ñ 126
 
4.3%
ã 67
 
2.3%
Ö 26
 
0.9%
í 20
 
0.7%
Other values (9) 50
 
1.7%
Latin Ext Additional
ValueCountFrequency (%)
6
100.0%
Modifier Letters
ValueCountFrequency (%)
ʻ 1
100.0%

country
Text

Missing 

Distinct 802
Distinct (%) < 0.1%
Missing 160727
Missing (%) 4.2%
Memory size 29.1 MiB

Length

Max length 57
Median length 51
Mean length 10.00230992
Min length 1

Characters and Unicode

Total characters 36542159
Distinct characters 71
Distinct categories 7
Distinct scripts 2
Distinct blocks 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 181
Unique (%) < 0.1%

Sample

1st row: Belize
2nd row: United States
3rd row: United States
4th row: United States
5th row: United States
Value Count Frequency (%)
united 1389854 25.2%
states 1377250 25.0%
mexico 187807 3.4%
brazil 153884 2.8%
philippines 110607 2.0%
colombia 95091 1.7%
canada 81065 1.5%
panama 78531 1.4%
venezuela 70949 1.3%
china 65037 1.2%
Other values (523) 1897614 34.5%

Most occurring characters

ValueCountFrequency (%)
t 4574640
12.5%
a 4243119
11.6%
e 4040895
11.1%
i 3253675
 
8.9%
n 2814016
 
7.7%
s 1918948
 
5.3%
d 1880183
 
5.1%
1854317
 
5.1%
S 1517111
 
4.2%
U 1433557
 
3.9%
Other values (61) 9011698
24.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 29173572
79.8%
Uppercase Letter 5475863
 
15.0%
Space Separator 1854317
 
5.1%
Other Punctuation 28773
 
0.1%
Open Punctuation 3859
 
< 0.1%
Close Punctuation 3859
 
< 0.1%
Dash Punctuation 1916
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t 4574640
15.7%
a 4243119
14.5%
e 4040895
13.9%
i 3253675
11.2%
n 2814016
9.6%
s 1918948
6.6%
d 1880183
6.4%
o 1001931
 
3.4%
l 897555
 
3.1%
r 829433
 
2.8%
Other values (22) 3719177
12.7%
Uppercase Letter
ValueCountFrequency (%)
S 1517111
27.7%
U 1433557
26.2%
C 382045
 
7.0%
P 357091
 
6.5%
M 278504
 
5.1%
B 256862
 
4.7%
I 148509
 
2.7%
G 148370
 
2.7%
A 148237
 
2.7%
R 112346
 
2.1%
Other values (15) 693231
12.7%
Other Punctuation
ValueCountFrequency (%)
. 14847
51.6%
, 10327
35.9%
' 1418
 
4.9%
/ 1333
 
4.6%
? 835
 
2.9%
: 11
 
< 0.1%
* 1
 
< 0.1%
; 1
 
< 0.1%
Open Punctuation
ValueCountFrequency (%)
[ 2919
75.6%
( 940
 
24.4%
Close Punctuation
ValueCountFrequency (%)
] 2919
75.6%
) 940
 
24.4%
Space Separator
ValueCountFrequency (%)
1854317
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1916
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 34649435
94.8%
Common 1892724
 
5.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
t 4574640
13.2%
a 4243119
12.2%
e 4040895
11.7%
i 3253675
9.4%
n 2814016
 
8.1%
s 1918948
 
5.5%
d 1880183
 
5.4%
S 1517111
 
4.4%
U 1433557
 
4.1%
o 1001931
 
2.9%
Other values (47) 7971360
23.0%
Common
ValueCountFrequency (%)
1854317
98.0%
. 14847
 
0.8%
, 10327
 
0.5%
[ 2919
 
0.2%
] 2919
 
0.2%
- 1916
 
0.1%
' 1418
 
0.1%
/ 1333
 
0.1%
( 940
 
< 0.1%
) 940
 
< 0.1%
Other values (4) 848
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 36538579
> 99.9%
None 3580
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
t 4574640
12.5%
a 4243119
11.6%
e 4040895
11.1%
i 3253675
 
8.9%
n 2814016
 
7.7%
s 1918948
 
5.3%
d 1880183
 
5.1%
1854317
 
5.1%
S 1517111
 
4.2%
U 1433557
 
3.9%
Other values (55) 9008118
24.7%
None
ValueCountFrequency (%)
é 1652
46.1%
ç 1023
28.6%
ã 403
 
11.3%
í 374
 
10.4%
á 100
 
2.8%
ô 28
 
0.8%

stateProvince
Text

Missing 

Distinct 7976
Distinct (%) 0.3%
Missing 1028496
Missing (%) 27.0%
Memory size 29.1 MiB

Length

Max length 69
Median length 52
Mean length 9.274103668
Min length 1

Characters and Unicode

Total characters 25833971
Distinct characters 158
Distinct categories 11
Distinct scripts 2
Distinct blocks 5
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 1947
Unique (%) 0.1%

Sample

1st row: Tennessee
2nd row: West Virginia
3rd row: Georgia
4th row: Maine
5th row: Texas
Value Count Frequency (%)
california 149080 4.0%
florida 127806 3.5%
virginia 102361 2.8%
new 80179 2.2%
carolina 80093 2.2%
north 67306 1.8%
texas 65503 1.8%
alaska 63769 1.7%
massachusetts 58634 1.6%
maryland 49985 1.4%
Other values (5671) 2857870 77.2%

Most occurring characters

ValueCountFrequency (%)
a 3896884
15.1%
i 2209345
 
8.6%
n 1894407
 
7.3%
o 1887302
 
7.3%
r 1678217
 
6.5%
e 1361541
 
5.3%
s 1218320
 
4.7%
l 1101834
 
4.3%
t 976954
 
3.8%
916983
 
3.5%
Other values (148) 8692184
33.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 21070008
81.6%
Uppercase Letter 3685221
 
14.3%
Space Separator 916983
 
3.5%
Dash Punctuation 71055
 
0.3%
Other Punctuation 47247
 
0.2%
Open Punctuation 21634
 
0.1%
Close Punctuation 21629
 
0.1%
Math Symbol 133
 
< 0.1%
Decimal Number 32
 
< 0.1%
Modifier Letter 27
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 3896884
18.5%
i 2209345
10.5%
n 1894407
9.0%
o 1887302
9.0%
r 1678217
8.0%
e 1361541
 
6.5%
s 1218320
 
5.8%
l 1101834
 
5.2%
t 976954
 
4.6%
u 771612
 
3.7%
Other values (75) 4073592
19.3%
Uppercase Letter
ValueCountFrequency (%)
C 530171
14.4%
M 367758
 
10.0%
N 283252
 
7.7%
S 281251
 
7.6%
A 269736
 
7.3%
P 209980
 
5.7%
T 183009
 
5.0%
V 162815
 
4.4%
F 155248
 
4.2%
B 127439
 
3.5%
Other values (34) 1114562
30.2%
Other Punctuation
ValueCountFrequency (%)
. 31198
66.0%
/ 6290
 
13.3%
' 4974
 
10.5%
, 3610
 
7.6%
? 1123
 
2.4%
& 47
 
0.1%
* 3
 
< 0.1%
: 1
 
< 0.1%
¡ 1
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
3 19
59.4%
4 5
 
15.6%
8 2
 
6.2%
2 2
 
6.2%
9 2
 
6.2%
6 1
 
3.1%
7 1
 
3.1%
Math Symbol
ValueCountFrequency (%)
= 126
94.7%
+ 6
 
4.5%
| 1
 
0.8%
Dash Punctuation
ValueCountFrequency (%)
- 71037
> 99.9%
18
 
< 0.1%
Open Punctuation
ValueCountFrequency (%)
[ 12376
57.2%
( 9258
42.8%
Close Punctuation
ValueCountFrequency (%)
] 12373
57.2%
) 9256
42.8%
Modifier Symbol
ValueCountFrequency (%)
´ 1
50.0%
¸ 1
50.0%
Space Separator
ValueCountFrequency (%)
916983
100.0%
Modifier Letter
ValueCountFrequency (%)
ʼ 27
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 24755229
95.8%
Common 1078742
 
4.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 3896884
15.7%
i 2209345
 
8.9%
n 1894407
 
7.7%
o 1887302
 
7.6%
r 1678217
 
6.8%
e 1361541
 
5.5%
s 1218320
 
4.9%
l 1101834
 
4.5%
t 976954
 
3.9%
u 771612
 
3.1%
Other values (119) 7758813
31.3%
Common
ValueCountFrequency (%)
916983
85.0%
- 71037
 
6.6%
. 31198
 
2.9%
[ 12376
 
1.1%
] 12373
 
1.1%
( 9258
 
0.9%
) 9256
 
0.9%
/ 6290
 
0.6%
' 4974
 
0.5%
, 3610
 
0.3%
Other values (19) 1387
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 25662145
99.3%
None 171747
 
0.7%
Latin Ext Additional 34
 
< 0.1%
Modifier Letters 27
 
< 0.1%
Punctuation 18
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 3896884
15.2%
i 2209345
 
8.6%
n 1894407
 
7.4%
o 1887302
 
7.4%
r 1678217
 
6.5%
e 1361541
 
5.3%
s 1218320
 
4.7%
l 1101834
 
4.3%
t 976954
 
3.8%
916983
 
3.6%
Other values (66) 8520358
33.2%
None
ValueCountFrequency (%)
á 60711
35.3%
í 35016
20.4%
é 28069
16.3%
ó 20470
 
11.9%
ã 10442
 
6.1%
ô 5625
 
3.3%
ñ 2577
 
1.5%
ü 2189
 
1.3%
ä 1170
 
0.7%
å 914
 
0.5%
Other values (58) 4564
 
2.7%
Modifier Letters
ValueCountFrequency (%)
ʼ 27
100.0%
Punctuation
ValueCountFrequency (%)
18
100.0%
Latin Ext Additional
ValueCountFrequency (%)
10
29.4%
5
14.7%
3
 
8.8%
3
 
8.8%
ế 3
 
8.8%
3
 
8.8%
2
 
5.9%
1
 
2.9%
1
 
2.9%
1
 
2.9%
Other values (2) 2
 
5.9%

county
Text

Missing 

Distinct 15792
Distinct (%) 1.8%
Missing 2948235
Missing (%) 77.3%
Memory size 29.1 MiB

Length

Max length 56
Median length 46
Mean length 10.23553814
Min length 1

Characters and Unicode

Total characters 8862584
Distinct characters 134
Distinct categories 12
Distinct scripts 3
Distinct blocks 6
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 4648
Unique (%) 0.5%

Sample

1st row: Randolph
2nd row: Decatur County
3rd row: Penobscot
4th row: Galveston County
5th row: Dona Ana
Value Count Frequency (%)
county 144252 10.8%
not 54273 4.1%
stated 54273 4.1%
san 21268 1.6%
prince 14567 1.1%
montgomery 13322 1.0%
district 13084 1.0%
santa 11912 0.9%
honolulu 11830 0.9%
(blank) 11178 0.8%
Other values (11130) 986539 73.8%

Most occurring characters

ValueCountFrequency (%)
a 840435
 
9.5%
o 723050
 
8.2%
n 688955
 
7.8%
e 672877
 
7.6%
t 608721
 
6.9%
r 479065
 
5.4%
470634
 
5.3%
i 444139
 
5.0%
u 379380
 
4.3%
l 341812
 
3.9%
Other values (124) 3213516
36.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 6918803
78.1%
Uppercase Letter 1317635
 
14.9%
Space Separator 470634
 
5.3%
Open Punctuation 58067
 
0.7%
Close Punctuation 58046
 
0.7%
Other Punctuation 21597
 
0.2%
Dash Punctuation 17642
 
0.2%
Decimal Number 68
 
< 0.1%
Modifier Symbol 51
 
< 0.1%
Math Symbol 32
 
< 0.1%
Other values (2) 9
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 840435
12.1%
o 723050
10.5%
n 688955
10.0%
e 672877
9.7%
t 608721
8.8%
r 479065
 
6.9%
i 444139
 
6.4%
u 379380
 
5.5%
l 341812
 
4.9%
s 302994
 
4.4%
Other values (54) 1437375
20.8%
Uppercase Letter
ValueCountFrequency (%)
C 256790
19.5%
S 157540
12.0%
M 106824
 
8.1%
N 84019
 
6.4%
B 77068
 
5.8%
P 76996
 
5.8%
A 63198
 
4.8%
L 53016
 
4.0%
H 52717
 
4.0%
D 51718
 
3.9%
Other values (32) 337749
25.6%
Other Punctuation
ValueCountFrequency (%)
' 11576
53.6%
. 6965
32.2%
/ 2212
 
10.2%
? 444
 
2.1%
, 323
 
1.5%
* 41
 
0.2%
& 27
 
0.1%
; 4
 
< 0.1%
: 2
 
< 0.1%
¡ 2
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
1 29
42.6%
2 24
35.3%
0 13
19.1%
4 2
 
2.9%
Math Symbol
ValueCountFrequency (%)
= 30
93.8%
+ 1
 
3.1%
~ 1
 
3.1%
Open Punctuation
ValueCountFrequency (%)
[ 54306
93.5%
( 3761
 
6.5%
Close Punctuation
ValueCountFrequency (%)
] 54286
93.5%
) 3760
 
6.5%
Dash Punctuation
ValueCountFrequency (%)
- 17439
98.8%
203
 
1.2%
Space Separator
ValueCountFrequency (%)
470634
100.0%
Modifier Symbol
ValueCountFrequency (%)
´ 51
100.0%
Modifier Letter
ValueCountFrequency (%)
ʻ 7
100.0%
Nonspacing Mark
ValueCountFrequency (%)
́ 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 8236438
92.9%
Common 626144
 
7.1%
Inherited 2
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 840435
 
10.2%
o 723050
 
8.8%
n 688955
 
8.4%
e 672877
 
8.2%
t 608721
 
7.4%
r 479065
 
5.8%
i 444139
 
5.4%
u 379380
 
4.6%
l 341812
 
4.1%
s 302994
 
3.7%
Other values (96) 2755010
33.4%
Common
ValueCountFrequency (%)
470634
75.2%
[ 54306
 
8.7%
] 54286
 
8.7%
- 17439
 
2.8%
' 11576
 
1.8%
. 6965
 
1.1%
( 3761
 
0.6%
) 3760
 
0.6%
/ 2212
 
0.4%
? 444
 
0.1%
Other values (17) 761
 
0.1%
Inherited
ValueCountFrequency (%)
́ 2
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 8833904
99.7%
None 28461
 
0.3%
Punctuation 203
 
< 0.1%
Modifier Letters 7
 
< 0.1%
Latin Ext Additional 7
 
< 0.1%
Diacriticals 2
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 840435
 
9.5%
o 723050
 
8.2%
n 688955
 
7.8%
e 672877
 
7.6%
t 608721
 
6.9%
r 479065
 
5.4%
470634
 
5.3%
i 444139
 
5.0%
u 379380
 
4.3%
l 341812
 
3.9%
Other values (65) 3184836
36.1%
None
ValueCountFrequency (%)
á 6113
21.5%
é 5084
17.9%
í 4771
16.8%
ó 4193
14.7%
ã 2909
10.2%
ç 1848
 
6.5%
è 601
 
2.1%
ô 591
 
2.1%
ñ 496
 
1.7%
ü 481
 
1.7%
Other values (41) 1374
 
4.8%
Punctuation
ValueCountFrequency (%)
203
100.0%
Modifier Letters
ValueCountFrequency (%)
ʻ 7
100.0%
Latin Ext Additional
ValueCountFrequency (%)
3
42.9%
1
 
14.3%
1
 
14.3%
ế 1
 
14.3%
1
 
14.3%
Diacriticals
ValueCountFrequency (%)
́ 2
100.0%

municipality
Text

Constant  Missing 

Distinct 1
Distinct (%) 100.0%
Missing 3814098
Missing (%) > 99.9%
Memory size 29.1 MiB

Length

Max length 23
Median length 23
Mean length 23
Min length 23

Characters and Unicode

Total characters 23
Distinct characters 15
Distinct categories 3
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 1
Unique (%) 100.0%

Sample

1st row: Degrees Minutes Seconds
Value Count Frequency (%)
degrees 1 33.3%
minutes 1 33.3%
seconds 1 33.3%

Most occurring characters

ValueCountFrequency (%)
e 5
21.7%
s 3
13.0%
2
 
8.7%
n 2
 
8.7%
D 1
 
4.3%
g 1
 
4.3%
r 1
 
4.3%
M 1
 
4.3%
i 1
 
4.3%
u 1
 
4.3%
Other values (5) 5
21.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 18
78.3%
Uppercase Letter 3
 
13.0%
Space Separator 2
 
8.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 5
27.8%
s 3
16.7%
n 2
 
11.1%
g 1
 
5.6%
r 1
 
5.6%
i 1
 
5.6%
u 1
 
5.6%
t 1
 
5.6%
c 1
 
5.6%
o 1
 
5.6%
Uppercase Letter
ValueCountFrequency (%)
D 1
33.3%
M 1
33.3%
S 1
33.3%
Space Separator
ValueCountFrequency (%)
2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 21
91.3%
Common 2
 
8.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 5
23.8%
s 3
14.3%
n 2
 
9.5%
D 1
 
4.8%
g 1
 
4.8%
r 1
 
4.8%
M 1
 
4.8%
i 1
 
4.8%
u 1
 
4.8%
t 1
 
4.8%
Other values (4) 4
19.0%
Common
ValueCountFrequency (%)
2
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 23
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 5
21.7%
s 3
13.0%
2
 
8.7%
n 2
 
8.7%
D 1
 
4.3%
g 1
 
4.3%
r 1
 
4.3%
M 1
 
4.3%
i 1
 
4.3%
u 1
 
4.3%
Other values (5) 5
21.7%

locality
Text

Missing 

Distinct 1351410
Distinct (%) 41.3%
Missing 544962
Missing (%) 14.3%
Memory size 29.1 MiB

Length

Max length 140152
Median length 426
Mean length 40.30902559
Min length 1

Characters and Unicode

Total characters 131775727
Distinct characters 359
Distinct categories 21
Distinct scripts 5
Distinct blocks 16
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 1062019
Unique (%) 32.5%

Sample

1st row: Carrie Bow Cay, Spur And Groove Zone
2nd row: Eastern edge of Nashville, Davidson County.
3rd row: Monongahela National Forest, 1.2-1.4 mi (by road) W of Bear Heaven Campground, on road to Bickle Knob
4th row: Hales Landing, Flint River about 7 miles below Bainbridge, basal Chattahoochee Formation, Oligocene, Vicksburgian
5th row: Orono
Value Count Frequency (%)
of 1091981 5.1%
de 280973 1.3%
island 276714 1.3%
km 234570 1.1%
on 205256 1.0%
near 197075 0.9%
the 184862 0.9%
road 183771 0.9%
mi 174123 0.8%
and 171053 0.8%
Other values (427681) 18346488 85.9%

Most occurring characters

ValueCountFrequency (%)
18057579
 
13.7%
a 12014815
 
9.1%
e 8972168
 
6.8%
o 8710903
 
6.6%
n 7329088
 
5.6%
i 6648936
 
5.0%
r 6396546
 
4.9%
t 5851047
 
4.4%
l 4796222
 
3.6%
s 4619692
 
3.5%
Other values (349) 48378731
36.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 89732453
68.1%
Space Separator 18057579
 
13.7%
Uppercase Letter 15046232
 
11.4%
Other Punctuation 5929390
 
4.5%
Decimal Number 1979217
 
1.5%
Open Punctuation 302503
 
0.2%
Close Punctuation 301254
 
0.2%
Dash Punctuation 279930
 
0.2%
Control 108177
 
0.1%
Math Symbol 24799
 
< 0.1%
Other values (11) 14193
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 12014815
13.4%
e 8972168
10.0%
o 8710903
9.7%
n 7329088
 
8.2%
i 6648936
 
7.4%
r 6396546
 
7.1%
t 5851047
 
6.5%
l 4796222
 
5.3%
s 4619692
 
5.1%
u 3306189
 
3.7%
Other values (145) 21086847
23.5%
Uppercase Letter
ValueCountFrequency (%)
S 1539532
 
10.2%
C 1531103
 
10.2%
M 1059073
 
7.0%
P 1021572
 
6.8%
R 999458
 
6.6%
B 928277
 
6.2%
N 860328
 
5.7%
A 723736
 
4.8%
L 658662
 
4.4%
I 650936
 
4.3%
Other values (75) 5073555
33.7%
Other Punctuation
ValueCountFrequency (%)
, 2750592
46.4%
. 2643574
44.6%
: 201217
 
3.4%
; 131007
 
2.2%
' 94383
 
1.6%
" 44710
 
0.8%
/ 32011
 
0.5%
& 18568
 
0.3%
? 5972
 
0.1%
# 5738
 
0.1%
Other values (9) 1618
 
< 0.1%
Control
ValueCountFrequency (%)
107476
99.4%
570
 
0.5%
 31
 
< 0.1%
 25
 
< 0.1%
 23
 
< 0.1%
 19
 
< 0.1%
 15
 
< 0.1%
 11
 
< 0.1%
 4
 
< 0.1%
 2
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
1 396666
20.0%
2 278718
14.1%
0 272453
13.8%
5 230645
11.7%
3 198601
10.0%
4 158189
 
8.0%
6 141329
 
7.1%
7 108515
 
5.5%
8 103465
 
5.2%
9 90636
 
4.6%
Math Symbol
ValueCountFrequency (%)
= 16935
68.3%
+ 3758
 
15.2%
± 2194
 
8.8%
~ 698
 
2.8%
> 661
 
2.7%
< 490
 
2.0%
| 52
 
0.2%
7
 
< 0.1%
3
 
< 0.1%
1
 
< 0.1%
Other Symbol
ValueCountFrequency (%)
° 3220
99.3%
9
 
0.3%
4
 
0.1%
4
 
0.1%
© 1
 
< 0.1%
1
 
< 0.1%
1
 
< 0.1%
¦ 1
 
< 0.1%
1
 
< 0.1%
Other Number
ValueCountFrequency (%)
½ 5249
66.6%
¼ 2255
28.6%
¾ 305
 
3.9%
² 35
 
0.4%
25
 
0.3%
³ 3
 
< 0.1%
3
 
< 0.1%
2
 
< 0.1%
Format
ValueCountFrequency (%)
­ 60
84.5%
2
 
2.8%
2
 
2.8%
2
 
2.8%
 2
 
2.8%
1
 
1.4%
1
 
1.4%
1
 
1.4%
Nonspacing Mark
ValueCountFrequency (%)
̩ 2
20.0%
̈ 2
20.0%
̄ 2
20.0%
̌ 1
10.0%
1
10.0%
́ 1
10.0%
ͤ 1
10.0%
Modifier Letter
ValueCountFrequency (%)
ʻ 187
96.4%
3
 
1.5%
1
 
0.5%
1
 
0.5%
1
 
0.5%
1
 
0.5%
Open Punctuation
ValueCountFrequency (%)
( 227265
75.1%
[ 74898
 
24.8%
161
 
0.1%
{ 94
 
< 0.1%
85
 
< 0.1%
Modifier Symbol
ValueCountFrequency (%)
´ 209
89.3%
¨ 13
 
5.6%
^ 10
 
4.3%
¯ 1
 
0.4%
˚ 1
 
0.4%
Currency Symbol
ValueCountFrequency (%)
¢ 58
55.2%
¤ 32
30.5%
$ 7
 
6.7%
£ 7
 
6.7%
¥ 1
 
1.0%
Dash Punctuation
ValueCountFrequency (%)
- 279899
> 99.9%
22
 
< 0.1%
9
 
< 0.1%
Close Punctuation
ValueCountFrequency (%)
) 226165
75.1%
] 74983
 
24.9%
} 106
 
< 0.1%
Final Punctuation
ValueCountFrequency (%)
» 357
90.4%
30
 
7.6%
8
 
2.0%
Initial Punctuation
ValueCountFrequency (%)
« 349
84.1%
65
 
15.7%
1
 
0.2%
Other Letter
ValueCountFrequency (%)
º 1340
97.5%
ª 34
 
2.5%
Space Separator
ValueCountFrequency (%)
18057579
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 276
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 104780003
79.5%
Common 26995650
 
20.5%
Greek 62
 
< 0.1%
Inherited 11
 
< 0.1%
Cyrillic 1
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 12014815
 
11.5%
e 8972168
 
8.6%
o 8710903
 
8.3%
n 7329088
 
7.0%
i 6648936
 
6.3%
r 6396546
 
6.1%
t 5851047
 
5.6%
l 4796222
 
4.6%
s 4619692
 
4.4%
u 3306189
 
3.2%
Other values (225) 36134397
34.5%
Common
ValueCountFrequency (%)
18057579
66.9%
, 2750592
 
10.2%
. 2643574
 
9.8%
1 396666
 
1.5%
- 279899
 
1.0%
2 278718
 
1.0%
0 272453
 
1.0%
5 230645
 
0.9%
( 227265
 
0.8%
) 226165
 
0.8%
Other values (94) 1632094
 
6.0%
Greek
ValueCountFrequency (%)
λ 14
22.6%
ν 11
17.7%
η 7
11.3%
Κ 7
11.3%
ή 7
11.3%
υ 7
11.3%
Π 2
 
3.2%
ω 2
 
3.2%
ρ 2
 
3.2%
ά 2
 
3.2%
Inherited
ValueCountFrequency (%)
̩ 2
18.2%
̈ 2
18.2%
̄ 2
18.2%
1
9.1%
̌ 1
9.1%
1
9.1%
́ 1
9.1%
ͤ 1
9.1%
Cyrillic
ValueCountFrequency (%)
ӗ 1
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 131387252
99.7%
None 387747
 
0.3%
Punctuation 440
 
< 0.1%
Modifier Letters 188
 
< 0.1%
Number Forms 30
 
< 0.1%
Latin Ext Additional 18
 
< 0.1%
Box Drawing 15
 
< 0.1%
Diacriticals 9
 
< 0.1%
Arrows 8
 
< 0.1%
Phonetic Ext 7
 
< 0.1%
Other values (6) 13
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
18057579
 
13.7%
a 12014815
 
9.1%
e 8972168
 
6.8%
o 8710903
 
6.6%
n 7329088
 
5.6%
i 6648936
 
5.1%
r 6396546
 
4.9%
t 5851047
 
4.5%
l 4796222
 
3.7%
s 4619692
 
3.5%
Other values (90) 47990256
36.5%
None
ValueCountFrequency (%)
í 96918
25.0%
á 69327
17.9%
é 46538
12.0%
ó 38861
10.0%
ñ 19435
 
5.0%
ã 15351
 
4.0%
ú 10979
 
2.8%
ç 9491
 
2.4%
ü 7701
 
2.0%
ä 7079
 
1.8%
Other values (192) 66067
17.0%
Modifier Letters
ValueCountFrequency (%)
ʻ 187
99.5%
˚ 1
 
0.5%
Punctuation
ValueCountFrequency (%)
161
36.6%
85
19.3%
65
14.8%
49
 
11.1%
30
 
6.8%
22
 
5.0%
9
 
2.0%
8
 
1.8%
2
 
0.5%
2
 
0.5%
Other values (6) 7
 
1.6%
Number Forms
ValueCountFrequency (%)
25
83.3%
3
 
10.0%
2
 
6.7%
Box Drawing
ValueCountFrequency (%)
9
60.0%
4
26.7%
1
 
6.7%
1
 
6.7%
Arrows
ValueCountFrequency (%)
7
87.5%
1
 
12.5%
Block Elements
ValueCountFrequency (%)
4
80.0%
1
 
20.0%
Latin Ext Additional
ValueCountFrequency (%)
3
16.7%
ḿ 3
16.7%
2
11.1%
2
11.1%
1
 
5.6%
1
 
5.6%
1
 
5.6%
1
 
5.6%
ế 1
 
5.6%
1
 
5.6%
Other values (2) 2
11.1%
Phonetic Ext
ValueCountFrequency (%)
3
42.9%
1
 
14.3%
1
 
14.3%
1
 
14.3%
1
 
14.3%
Math Operators
ValueCountFrequency (%)
3
100.0%
Diacriticals
ValueCountFrequency (%)
̩ 2
22.2%
̈ 2
22.2%
̄ 2
22.2%
̌ 1
11.1%
́ 1
11.1%
ͤ 1
11.1%
IPA Ext
ValueCountFrequency (%)
ɶ 2
100.0%
Diacriticals Sup
ValueCountFrequency (%)
1
100.0%
Cyrillic
ValueCountFrequency (%)
ӗ 1
100.0%
Greek Ext
ValueCountFrequency (%)
1
100.0%

verbatimLocality
Text

Missing 

Distinct 3
Distinct (%) 100.0%
Missing 3814096
Missing (%) > 99.9%
Memory size 29.1 MiB

Length

Max length 12
Median length 12
Mean length 9
Min length 3

Characters and Unicode

Total characters 27
Distinct characters 13
Distinct categories 4
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 3
Unique (%) 100.0%

Sample

1st row 78 50' 50" W
2nd row 4.6
3rd row 79 51'48.5"W
ValueCountFrequency (%)
50 2
28.6%
78 1
14.3%
w 1
14.3%
4.6 1
14.3%
79 1
14.3%
51'48.5"w 1
14.3%

Most occurring characters

ValueCountFrequency (%)
4
14.8%
5 4
14.8%
7 2
7.4%
8 2
7.4%
0 2
7.4%
' 2
7.4%
" 2
7.4%
W 2
7.4%
4 2
7.4%
. 2
7.4%
Other values (3) 3
11.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 15
55.6%
Other Punctuation 6
 
22.2%
Space Separator 4
 
14.8%
Uppercase Letter 2
 
7.4%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
5 4
26.7%
7 2
13.3%
8 2
13.3%
0 2
13.3%
4 2
13.3%
6 1
 
6.7%
9 1
 
6.7%
1 1
 
6.7%
Other Punctuation
ValueCountFrequency (%)
' 2
33.3%
" 2
33.3%
. 2
33.3%
Space Separator
ValueCountFrequency (%)
4
100.0%
Uppercase Letter
ValueCountFrequency (%)
W 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 25
92.6%
Latin 2
 
7.4%

Most frequent character per script

Common
ValueCountFrequency (%)
4
16.0%
5 4
16.0%
7 2
8.0%
8 2
8.0%
0 2
8.0%
' 2
8.0%
" 2
8.0%
4 2
8.0%
. 2
8.0%
6 1
 
4.0%
Other values (2) 2
8.0%
Latin
ValueCountFrequency (%)
W 2
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 27
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
4
14.8%
5 4
14.8%
7 2
7.4%
8 2
7.4%
0 2
7.4%
' 2
7.4%
" 2
7.4%
W 2
7.4%
4 2
7.4%
. 2
7.4%
Other values (3) 3
11.1%

minimumElevationInMeters
Text

Missing 

Distinct 4481
Distinct (%) 0.5%
Missing 2930460
Missing (%) 76.8%
Memory size 29.1 MiB

Length

Max length 23
Median length 9
Mean length 5.322696259
Min length 3

Characters and Unicode

Total characters 4703342
Distinct characters 27
Distinct categories 6
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 679
Unique (%) 0.1%

Sample

1st row 1049.0
2nd row 140.0
3rd row 2880.0
4th row 1219.0
5th row 1100.0
ValueCountFrequency (%)
1000.0 14755
 
1.7%
100.0 14661
 
1.7%
200.0 14161
 
1.6%
300.0 11873
 
1.3%
500.0 11745
 
1.3%
1500.0 11723
 
1.3%
0.0 10988
 
1.2%
800.0 10953
 
1.2%
900.0 10515
 
1.2%
400.0 10358
 
1.2%
Other values (4452) 761911
86.2%

Most occurring characters

ValueCountFrequency (%)
0 1744284
37.1%
. 883637
18.8%
1 466563
 
9.9%
2 333783
 
7.1%
5 269825
 
5.7%
3 228074
 
4.8%
4 183692
 
3.9%
6 159411
 
3.4%
7 150513
 
3.2%
8 147530
 
3.1%
Other values (17) 136030
 
2.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 3819512
81.2%
Other Punctuation 883637
 
18.8%
Dash Punctuation 147
 
< 0.1%
Lowercase Letter 36
 
< 0.1%
Uppercase Letter 6
 
< 0.1%
Space Separator 4
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 10
27.8%
s 6
16.7%
n 4
 
11.1%
g 2
 
5.6%
r 2
 
5.6%
i 2
 
5.6%
u 2
 
5.6%
t 2
 
5.6%
c 2
 
5.6%
o 2
 
5.6%
Decimal Number
ValueCountFrequency (%)
0 1744284
45.7%
1 466563
 
12.2%
2 333783
 
8.7%
5 269825
 
7.1%
3 228074
 
6.0%
4 183692
 
4.8%
6 159411
 
4.2%
7 150513
 
3.9%
8 147530
 
3.9%
9 135837
 
3.6%
Uppercase Letter
ValueCountFrequency (%)
D 2
33.3%
M 2
33.3%
S 2
33.3%
Other Punctuation
ValueCountFrequency (%)
. 883637
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 147
100.0%
Space Separator
ValueCountFrequency (%)
4
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 4703300
> 99.9%
Latin 42
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 10
23.8%
s 6
14.3%
n 4
 
9.5%
D 2
 
4.8%
g 2
 
4.8%
r 2
 
4.8%
M 2
 
4.8%
i 2
 
4.8%
u 2
 
4.8%
t 2
 
4.8%
Other values (4) 8
19.0%
Common
ValueCountFrequency (%)
0 1744284
37.1%
. 883637
18.8%
1 466563
 
9.9%
2 333783
 
7.1%
5 269825
 
5.7%
3 228074
 
4.8%
4 183692
 
3.9%
6 159411
 
3.4%
7 150513
 
3.2%
8 147530
 
3.1%
Other values (3) 135988
 
2.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4703342
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 1744284
37.1%
. 883637
18.8%
1 466563
 
9.9%
2 333783
 
7.1%
5 269825
 
5.7%
3 228074
 
4.8%
4 183692
 
3.9%
6 159411
 
3.4%
7 150513
 
3.2%
8 147530
 
3.1%
Other values (17) 136030
 
2.9%

maximumElevationInMeters
Text

Missing 

Distinct 2781
Distinct (%) 0.8%
Missing 3486461
Missing (%) 91.4%
Memory size 29.1 MiB

Length

Max length 7
Median length 6
Mean length 5.334207265
Min length 3

Characters and Unicode

Total characters 1747689
Distinct characters 12
Distinct categories 3
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 622
Unique (%) 0.2%

Sample

1st row 1146.0
2nd row 24.0
3rd row 2000.0
4th row 600.0
5th row 700.0
ValueCountFrequency (%)
1000.0 5570
 
1.7%
1500.0 4985
 
1.5%
600.0 4930
 
1.5%
500.0 4852
 
1.5%
200.0 4632
 
1.4%
900.0 4315
 
1.3%
1200.0 4276
 
1.3%
100.0 4187
 
1.3%
300.0 4027
 
1.2%
400.0 3889
 
1.2%
Other values (2767) 281975
86.1%

Most occurring characters

ValueCountFrequency (%)
0 651948
37.3%
. 327638
18.7%
1 178996
 
10.2%
2 118924
 
6.8%
5 96009
 
5.5%
3 85495
 
4.9%
4 68593
 
3.9%
6 61434
 
3.5%
7 55678
 
3.2%
8 52906
 
3.0%
Other values (2) 50068
 
2.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1420040
81.3%
Other Punctuation 327638
 
18.7%
Dash Punctuation 11
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 651948
45.9%
1 178996
 
12.6%
2 118924
 
8.4%
5 96009
 
6.8%
3 85495
 
6.0%
4 68593
 
4.8%
6 61434
 
4.3%
7 55678
 
3.9%
8 52906
 
3.7%
9 50057
 
3.5%
Other Punctuation
ValueCountFrequency (%)
. 327638
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 11
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1747689
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 651948
37.3%
. 327638
18.7%
1 178996
 
10.2%
2 118924
 
6.8%
5 96009
 
5.5%
3 85495
 
4.9%
4 68593
 
3.9%
6 61434
 
3.5%
7 55678
 
3.2%
8 52906
 
3.0%
Other values (2) 50068
 
2.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1747689
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 651948
37.3%
. 327638
18.7%
1 178996
 
10.2%
2 118924
 
6.8%
5 96009
 
5.5%
3 85495
 
4.9%
4 68593
 
3.9%
6 61434
 
3.5%
7 55678
 
3.2%
8 52906
 
3.0%
Other values (2) 50068
 
2.9%

verbatimElevation
Text

Missing 

Distinct 3250
Distinct (%) 2.9%
Missing 3703697
Missing (%) 97.1%
Memory size 29.1 MiB

Length

Max length 152
Median length 124
Mean length 7.486739371
Min length 1

Characters and Unicode

Total characters 826551
Distinct characters 79
Distinct categories 9
Distinct scripts 2
Distinct blocks 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 807
Unique (%) 0.7%

Sample

1st row 3600 (3440-3760) ft
2nd row ~1800 ft.
3rd row 80 ft
4th row 160 m
5th row 150 m
ValueCountFrequency (%)
ft 79883
34.3%
m 25814
 
11.1%
ca 5656
 
2.4%
feet 1786
 
0.8%
200 1755
 
0.8%
1100-1350 1649
 
0.7%
10 1423
 
0.6%
20 1246
 
0.5%
3500 1175
 
0.5%
3400 1167
 
0.5%
Other values (2139) 111547
47.9%

Most occurring characters

ValueCountFrequency (%)
0 164251
19.9%
122699
14.8%
t 85121
10.3%
f 82821
10.0%
1 43159
 
5.2%
3 41182
 
5.0%
2 39236
 
4.7%
4 35494
 
4.3%
5 33415
 
4.0%
m 27534
 
3.3%
Other values (69) 151639
18.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 429592
52.0%
Lowercase Letter 249040
30.1%
Space Separator 122699
 
14.8%
Dash Punctuation 12951
 
1.6%
Other Punctuation 8620
 
1.0%
Uppercase Letter 2190
 
0.3%
Open Punctuation 636
 
0.1%
Close Punctuation 636
 
0.1%
Math Symbol 187
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t 85121
34.2%
f 82821
33.3%
m 27534
 
11.1%
e 12456
 
5.0%
a 9784
 
3.9%
c 6790
 
2.7%
s 4113
 
1.7%
l 3680
 
1.5%
o 3258
 
1.3%
r 2692
 
1.1%
Other values (15) 10791
 
4.3%
Uppercase Letter
ValueCountFrequency (%)
D 611
27.9%
T 289
13.2%
P 240
 
11.0%
W 220
 
10.0%
R 188
 
8.6%
A 168
 
7.7%
C 93
 
4.2%
N 61
 
2.8%
G 52
 
2.4%
S 37
 
1.7%
Other values (13) 231
 
10.5%
Decimal Number
ValueCountFrequency (%)
0 164251
38.2%
1 43159
 
10.0%
3 41182
 
9.6%
2 39236
 
9.1%
4 35494
 
8.3%
5 33415
 
7.8%
6 25083
 
5.8%
8 19351
 
4.5%
7 16238
 
3.8%
9 12183
 
2.8%
Other Punctuation
ValueCountFrequency (%)
. 7357
85.3%
: 599
 
6.9%
' 280
 
3.2%
, 249
 
2.9%
" 50
 
0.6%
? 45
 
0.5%
; 21
 
0.2%
/ 11
 
0.1%
& 7
 
0.1%
1
 
< 0.1%
Math Symbol
ValueCountFrequency (%)
< 96
51.3%
+ 32
 
17.1%
= 30
 
16.0%
> 17
 
9.1%
~ 12
 
6.4%
Open Punctuation
ValueCountFrequency (%)
( 580
91.2%
[ 56
 
8.8%
Close Punctuation
ValueCountFrequency (%)
) 580
91.2%
] 56
 
8.8%
Space Separator
ValueCountFrequency (%)
122699
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 12951
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 575321
69.6%
Latin 251230
30.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
t 85121
33.9%
f 82821
33.0%
m 27534
 
11.0%
e 12456
 
5.0%
a 9784
 
3.9%
c 6790
 
2.7%
s 4113
 
1.6%
l 3680
 
1.5%
o 3258
 
1.3%
r 2692
 
1.1%
Other values (38) 12981
 
5.2%
Common
ValueCountFrequency (%)
0 164251
28.5%
122699
21.3%
1 43159
 
7.5%
3 41182
 
7.2%
2 39236
 
6.8%
4 35494
 
6.2%
5 33415
 
5.8%
6 25083
 
4.4%
8 19351
 
3.4%
7 16238
 
2.8%
Other values (21) 35213
 
6.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 826550
> 99.9%
Punctuation 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 164251
19.9%
122699
14.8%
t 85121
10.3%
f 82821
10.0%
1 43159
 
5.2%
3 41182
 
5.0%
2 39236
 
4.7%
4 35494
 
4.3%
5 33415
 
4.0%
m 27534
 
3.3%
Other values (68) 151638
18.3%
Punctuation
ValueCountFrequency (%)
1
100.0%

minimumDepthInMeters
Text

Missing 

Distinct 5449
Distinct (%) 1.3%
Missing 3390497
Missing (%) 88.9%
Memory size 29.1 MiB

Length

Max length 8
Median length 7
Mean length 4.186890052
Min length 3

Characters and Unicode

Total characters 1773575
Distinct characters 12
Distinct categories 3
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 1641
Unique (%) 0.4%

Sample

1st row 9.1
2nd row 200.0
3rd row 3200.0
4th row 30.0
5th row 844.0
ValueCountFrequency (%)
0.0 50928
 
12.0%
1.0 10648
 
2.5%
3.0 9584
 
2.3%
2.0 8926
 
2.1%
15.0 7397
 
1.7%
18.0 5876
 
1.4%
9.0 5651
 
1.3%
27.0 4728
 
1.1%
37.0 4372
 
1.0%
5.0 4195
 
1.0%
Other values (5437) 311297
73.5%

Most occurring characters

ValueCountFrequency (%)
0 547302
30.9%
. 423602
23.9%
1 159203
 
9.0%
2 119170
 
6.7%
5 94176
 
5.3%
3 91862
 
5.2%
4 80810
 
4.6%
8 70149
 
4.0%
6 67703
 
3.8%
7 61571
 
3.5%
Other values (2) 58027
 
3.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1349952
76.1%
Other Punctuation 423602
 
23.9%
Dash Punctuation 21
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 547302
40.5%
1 159203
 
11.8%
2 119170
 
8.8%
5 94176
 
7.0%
3 91862
 
6.8%
4 80810
 
6.0%
8 70149
 
5.2%
6 67703
 
5.0%
7 61571
 
4.6%
9 58006
 
4.3%
Other Punctuation
ValueCountFrequency (%)
. 423602
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 21
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1773575
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 547302
30.9%
. 423602
23.9%
1 159203
 
9.0%
2 119170
 
6.7%
5 94176
 
5.3%
3 91862
 
5.2%
4 80810
 
4.6%
8 70149
 
4.0%
6 67703
 
3.8%
7 61571
 
3.5%
Other values (2) 58027
 
3.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1773575
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 547302
30.9%
. 423602
23.9%
1 159203
 
9.0%
2 119170
 
6.7%
5 94176
 
5.3%
3 91862
 
5.2%
4 80810
 
4.6%
8 70149
 
4.0%
6 67703
 
3.8%
7 61571
 
3.5%
Other values (2) 58027
 
3.3%

maximumDepthInMeters
Text

Missing 

Distinct 5288
Distinct (%) 1.4%
Missing 3423246
Missing (%) 89.8%
Memory size 29.1 MiB

Length

Max length 8
Median length 7
Mean length 4.294187329
Min length 3

Characters and Unicode

Total characters 1678396
Distinct characters 12
Distinct categories 3
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 1538
Unique (%) 0.4%

Sample

1st row 9.1
2nd row 200.0
3rd row 3200.0
4th row 50.0
5th row 804.0
ValueCountFrequency (%)
1.0 18208
 
4.7%
2.0 8861
 
2.3%
3.0 8173
 
2.1%
5.0 7349
 
1.9%
9.0 5992
 
1.5%
15.0 5745
 
1.5%
18.0 5657
 
1.4%
6.0 5297
 
1.4%
27.0 5260
 
1.3%
0.0 4935
 
1.3%
Other values (5276) 315376
80.7%

Most occurring characters

ValueCountFrequency (%)
0 473419
28.2%
. 390853
23.3%
1 171730
 
10.2%
2 121541
 
7.2%
5 96119
 
5.7%
3 89936
 
5.4%
4 79728
 
4.8%
8 68887
 
4.1%
6 67913
 
4.0%
7 61466
 
3.7%
Other values (2) 56804
 
3.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1287522
76.7%
Other Punctuation 390853
 
23.3%
Dash Punctuation 21
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 473419
36.8%
1 171730
 
13.3%
2 121541
 
9.4%
5 96119
 
7.5%
3 89936
 
7.0%
4 79728
 
6.2%
8 68887
 
5.4%
6 67913
 
5.3%
7 61466
 
4.8%
9 56783
 
4.4%
Other Punctuation
ValueCountFrequency (%)
. 390853
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 21
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1678396
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 473419
28.2%
. 390853
23.3%
1 171730
 
10.2%
2 121541
 
7.2%
5 96119
 
5.7%
3 89936
 
5.4%
4 79728
 
4.8%
8 68887
 
4.1%
6 67913
 
4.0%
7 61466
 
3.7%
Other values (2) 56804
 
3.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1678396
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 473419
28.2%
. 390853
23.3%
1 171730
 
10.2%
2 121541
 
7.2%
5 96119
 
5.7%
3 89936
 
5.4%
4 79728
 
4.8%
8 68887
 
4.1%
6 67913
 
4.0%
7 61466
 
3.7%
Other values (2) 56804
 
3.4%

verbatimDepth
Text

Missing 

Distinct 1114
Distinct (%) 4.8%
Missing 3790849
Missing (%) 99.4%
Memory size 29.1 MiB

Length

Max length 147466
Median length 91
Mean length 15.05470968
Min length 1

Characters and Unicode

Total characters 350022
Distinct characters 100
Distinct categories 11
Distinct scripts 2
Distinct blocks 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 551
Unique (%) 2.4%

Sample

1st row Littoral
2nd row 00000000, 00000013
3rd row penetration depth: 15cm
4th row 1 ms ca.
5th row Intertidal
ValueCountFrequency (%)
ca 10580
21.9%
intertidal 4974
 
10.3%
surface 2615
 
5.4%
depths 1198
 
2.5%
recorded 1194
 
2.5%
multiple 1187
 
2.5%
depth 796
 
1.6%
shore 499
 
1.0%
at 486
 
1.0%
0-300 481
 
1.0%
Other values (4738) 24276
50.3%

Most occurring characters

ValueCountFrequency (%)
35104
 
10.0%
a 27031
 
7.7%
e 23974
 
6.8%
t 21796
 
6.2%
18475
 
5.3%
c 16923
 
4.8%
r 16167
 
4.6%
i 14121
 
4.0%
d 13514
 
3.9%
l 12486
 
3.6%
Other values (90) 150431
43.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 203573
58.2%
Decimal Number 42845
 
12.2%
Control 35290
 
10.1%
Uppercase Letter 22929
 
6.6%
Other Punctuation 21619
 
6.2%
Space Separator 18475
 
5.3%
Dash Punctuation 4705
 
1.3%
Math Symbol 221
 
0.1%
Open Punctuation 184
 
0.1%
Close Punctuation 180
 
0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 27031
13.3%
e 23974
11.8%
t 21796
10.7%
c 16923
8.3%
r 16167
7.9%
i 14121
 
6.9%
d 13514
 
6.6%
l 12486
 
6.1%
n 11247
 
5.5%
o 8998
 
4.4%
Other values (28) 37316
18.3%
Uppercase Letter
ValueCountFrequency (%)
I 4794
20.9%
S 4128
18.0%
A 2642
11.5%
M 2208
9.6%
C 2204
9.6%
N 1087
 
4.7%
P 851
 
3.7%
U 672
 
2.9%
L 573
 
2.5%
E 490
 
2.1%
Other values (16) 3280
14.3%
Other Punctuation
ValueCountFrequency (%)
. 11298
52.3%
, 4253
 
19.7%
: 3616
 
16.7%
/ 1434
 
6.6%
" 380
 
1.8%
' 255
 
1.2%
; 189
 
0.9%
? 114
 
0.5%
& 58
 
0.3%
@ 16
 
0.1%
Other values (3) 6
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
0 12422
29.0%
1 5730
13.4%
2 4152
 
9.7%
3 3884
 
9.1%
5 3261
 
7.6%
8 3129
 
7.3%
6 2893
 
6.8%
4 2705
 
6.3%
7 2371
 
5.5%
9 2298
 
5.4%
Math Symbol
ValueCountFrequency (%)
= 136
61.5%
< 60
27.1%
+ 13
 
5.9%
~ 12
 
5.4%
Control
ValueCountFrequency (%)
35104
99.5%
186
 
0.5%
Open Punctuation
ValueCountFrequency (%)
( 179
97.3%
[ 5
 
2.7%
Close Punctuation
ValueCountFrequency (%)
) 175
97.2%
] 5
 
2.8%
Space Separator
ValueCountFrequency (%)
18475
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 4705
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 226502
64.7%
Common 123520
35.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 27031
11.9%
e 23974
 
10.6%
t 21796
 
9.6%
c 16923
 
7.5%
r 16167
 
7.1%
i 14121
 
6.2%
d 13514
 
6.0%
l 12486
 
5.5%
n 11247
 
5.0%
o 8998
 
4.0%
Other values (54) 60245
26.6%
Common
ValueCountFrequency (%)
35104
28.4%
18475
15.0%
0 12422
 
10.1%
. 11298
 
9.1%
1 5730
 
4.6%
- 4705
 
3.8%
, 4253
 
3.4%
2 4152
 
3.4%
3 3884
 
3.1%
: 3616
 
2.9%
Other values (26) 19881
16.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 349964
> 99.9%
None 58
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
35104
 
10.0%
a 27031
 
7.7%
e 23974
 
6.9%
t 21796
 
6.2%
18475
 
5.3%
c 16923
 
4.8%
r 16167
 
4.6%
i 14121
 
4.0%
d 13514
 
3.9%
l 12486
 
3.6%
Other values (78) 150373
43.0%
None
ValueCountFrequency (%)
í 12
20.7%
ó 9
15.5%
á 8
13.8%
ü 7
12.1%
é 7
12.1%
ô 6
10.3%
ñ 2
 
3.4%
ä 2
 
3.4%
ã 2
 
3.4%
ø 1
 
1.7%
Other values (2) 2
 
3.4%

locationRemarks
Text

Constant  Missing 

Distinct 1
Distinct (%) 100.0%
Missing 3814098
Missing (%) > 99.9%
Memory size 29.1 MiB

Length

Max length 16
Median length 16
Mean length 16
Min length 16

Characters and Unicode

Total characters 16
Distinct characters 12
Distinct categories 4
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 1
Unique (%) 100.0%

Sample

1st row DeFilipps, R. A.
ValueCountFrequency (%)
defilipps 1
33.3%
r 1
33.3%
a 1
33.3%

Most occurring characters

ValueCountFrequency (%)
i 2
12.5%
p 2
12.5%
2
12.5%
. 2
12.5%
D 1
6.2%
e 1
6.2%
F 1
6.2%
l 1
6.2%
s 1
6.2%
, 1
6.2%
Other values (2) 2
12.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 7
43.8%
Uppercase Letter 4
25.0%
Other Punctuation 3
18.8%
Space Separator 2
 
12.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 2
28.6%
p 2
28.6%
e 1
14.3%
l 1
14.3%
s 1
14.3%
Uppercase Letter
ValueCountFrequency (%)
D 1
25.0%
F 1
25.0%
R 1
25.0%
A 1
25.0%
Other Punctuation
ValueCountFrequency (%)
. 2
66.7%
, 1
33.3%
Space Separator
ValueCountFrequency (%)
2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 11
68.8%
Common 5
31.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 2
18.2%
p 2
18.2%
D 1
9.1%
e 1
9.1%
F 1
9.1%
l 1
9.1%
s 1
9.1%
R 1
9.1%
A 1
9.1%
Common
ValueCountFrequency (%)
2
40.0%
. 2
40.0%
, 1
20.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 16
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 2
12.5%
p 2
12.5%
2
12.5%
. 2
12.5%
D 1
6.2%
e 1
6.2%
F 1
6.2%
l 1
6.2%
s 1
6.2%
, 1
6.2%
Other values (2) 2
12.5%

decimalLatitude
Text

Missing 

Distinct 119398
Distinct (%) 10.4%
Missing 2665103
Missing (%) 69.9%
Memory size 29.1 MiB

Length

Max length 11
Median length 10
Mean length 6.153433955
Min length 3

Characters and Unicode

Total characters 7070271
Distinct characters 13
Distinct categories 4
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 49829
Unique (%) 4.3%

Sample

1st row 16.8033
2nd row 38.9361
3rd row 29.2483
4th row 44.8831
5th row 29.2586
ValueCountFrequency (%)
25.58 4259
 
0.4%
40.6583 3632
 
0.3%
26.17 3044
 
0.3%
26.5 2214
 
0.2%
39.6891 2124
 
0.2%
38.9694 1853
 
0.2%
39.6306 1749
 
0.2%
38.895 1685
 
0.1%
26.97 1656
 
0.1%
60.75 1583
 
0.1%
Other values (110672) 1125197
97.9%

Most occurring characters

ValueCountFrequency (%)
. 1148996
16.3%
3 918831
13.0%
2 629781
8.9%
1 613004
8.7%
5 564550
8.0%
8 556515
7.9%
7 544946
7.7%
4 530325
7.5%
6 524922
7.4%
9 445804
 
6.3%
Other values (3) 592597
8.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 5698095
80.6%
Other Punctuation 1148996
 
16.3%
Dash Punctuation 223153
 
3.2%
Uppercase Letter 27
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
3 918831
16.1%
2 629781
11.1%
1 613004
10.8%
5 564550
9.9%
8 556515
9.8%
7 544946
9.6%
4 530325
9.3%
6 524922
9.2%
9 445804
7.8%
0 369417
6.5%
Other Punctuation
ValueCountFrequency (%)
. 1148996
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 223153
100.0%
Uppercase Letter
ValueCountFrequency (%)
E 27
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 7070244
> 99.9%
Latin 27
 
< 0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
. 1148996
16.3%
3 918831
13.0%
2 629781
8.9%
1 613004
8.7%
5 564550
8.0%
8 556515
7.9%
7 544946
7.7%
4 530325
7.5%
6 524922
7.4%
9 445804
 
6.3%
Other values (2) 592570
8.4%
Latin
ValueCountFrequency (%)
E 27
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 7070271
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 1148996
16.3%
3 918831
13.0%
2 629781
8.9%
1 613004
8.7%
5 564550
8.0%
8 556515
7.9%
7 544946
7.7%
4 530325
7.5%
6 524922
7.4%
9 445804
 
6.3%
Other values (3) 592597
8.4%

decimalLongitude
Text

Missing 

Distinct 124298
Distinct (%) 10.8%
Missing 2665103
Missing (%) 69.9%
Memory size 29.1 MiB

Length

Max length 10
Median length 9
Mean length 7.019817301
Min length 3

Characters and Unicode

Total characters 8065742
Distinct characters 12
Distinct categories 3
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 49181
Unique (%) 4.3%

Sample

1st row -88.0767
2nd row -79.6908
3rd row -88.1214
4th row -68.672
5th row -94.9533
ValueCountFrequency (%)
80.1 4295
 
0.4%
105.644 2150
 
0.2%
127.848 1835
 
0.2%
77.4714 1749
 
0.2%
88.08 1737
 
0.2%
67.7683 1710
 
0.1%
77.0367 1651
 
0.1%
139.5 1588
 
0.1%
80.13 1583
 
0.1%
77.1767 1529
 
0.1%
Other values (114404) 1129169
98.3%

Most occurring characters

ValueCountFrequency (%)
. 1148996
14.2%
- 937099
11.6%
7 848574
10.5%
1 775389
9.6%
8 717466
8.9%
6 625080
7.7%
3 602433
7.5%
5 548904
6.8%
2 527202
6.5%
9 481545
6.0%
Other values (2) 853054
10.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 5979647
74.1%
Other Punctuation 1148996
 
14.2%
Dash Punctuation 937099
 
11.6%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
7 848574
14.2%
1 775389
13.0%
8 717466
12.0%
6 625080
10.5%
3 602433
10.1%
5 548904
9.2%
2 527202
8.8%
9 481545
8.1%
4 436297
7.3%
0 416757
7.0%
Other Punctuation
ValueCountFrequency (%)
. 1148996
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 937099
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 8065742
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
. 1148996
14.2%
- 937099
11.6%
7 848574
10.5%
1 775389
9.6%
8 717466
8.9%
6 625080
7.7%
3 602433
7.5%
5 548904
6.8%
2 527202
6.5%
9 481545
6.0%
Other values (2) 853054
10.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 8065742
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 1148996
14.2%
- 937099
11.6%
7 848574
10.5%
1 775389
9.6%
8 717466
8.9%
6 625080
7.7%
3 602433
7.5%
5 548904
6.8%
2 527202
6.5%
9 481545
6.0%
Other values (2) 853054
10.6%

geodeticDatum
Text

Missing 

Distinct 32
Distinct (%) < 0.1%
Missing 3696977
Missing (%) 96.9%
Memory size 29.1 MiB

Length

Max length 28
Median length 5
Mean length 8.18266423
Min length 3

Characters and Unicode

Total characters 958370
Distinct characters 49
Distinct categories 9
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 9
Unique (%) < 0.1%

Sample

1st row WGS84
2nd row WGS84
3rd row WGS84
4th row NAD27
5th row WGS84
ValueCountFrequency (%)
wgs84 65622
37.8%
84 25578
 
14.7%
wgs 25577
 
14.7%
epsg:4326 25144
 
14.5%
nad27 13658
 
7.9%
nad83 3959
 
2.3%
prp_m 3499
 
2.0%
not 2459
 
1.4%
recorded 2459
 
1.4%
agd66 947
 
0.5%
Other values (32) 4709
 
2.7%

Most occurring characters

ValueCountFrequency (%)
G 118860
12.4%
4 117158
12.2%
S 116960
12.2%
8 95317
9.9%
W 91309
 
9.5%
56489
 
5.9%
2 40059
 
4.2%
P 32649
 
3.4%
3 29150
 
3.0%
6 27541
 
2.9%
Other values (39) 232878
24.3%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 451198
47.1%
Decimal Number 326062
34.0%
Space Separator 56489
 
5.9%
Lowercase Letter 44172
 
4.6%
Close Punctuation 25648
 
2.7%
Open Punctuation 25648
 
2.7%
Other Punctuation 25648
 
2.7%
Connector Punctuation 3499
 
0.4%
Dash Punctuation 6
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 9127
20.7%
d 6131
13.9%
o 5896
13.3%
t 4412
10.0%
r 4252
9.6%
c 3436
 
7.8%
n 2902
 
6.6%
a 2759
 
6.2%
u 1214
 
2.7%
m 977
 
2.2%
Other values (8) 3066
 
6.9%
Uppercase Letter
ValueCountFrequency (%)
G 118860
26.3%
S 116960
25.9%
W 91309
20.2%
P 32649
 
7.2%
E 25647
 
5.7%
D 19535
 
4.3%
A 18803
 
4.2%
N 18412
 
4.1%
R 4240
 
0.9%
M 3499
 
0.8%
Other values (4) 1284
 
0.3%
Decimal Number
ValueCountFrequency (%)
4 117158
35.9%
8 95317
29.2%
2 40059
 
12.3%
3 29150
 
8.9%
6 27541
 
8.4%
7 13763
 
4.2%
0 2324
 
0.7%
9 604
 
0.2%
1 77
 
< 0.1%
5 69
 
< 0.1%
Other Punctuation
ValueCountFrequency (%)
: 25647
> 99.9%
/ 1
 
< 0.1%
Space Separator
ValueCountFrequency (%)
56489
100.0%
Close Punctuation
ValueCountFrequency (%)
) 25648
100.0%
Open Punctuation
ValueCountFrequency (%)
( 25648
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 3499
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 6
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 495370
51.7%
Common 463000
48.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
G 118860
24.0%
S 116960
23.6%
W 91309
18.4%
P 32649
 
6.6%
E 25647
 
5.2%
D 19535
 
3.9%
A 18803
 
3.8%
N 18412
 
3.7%
e 9127
 
1.8%
d 6131
 
1.2%
Other values (22) 37937
 
7.7%
Common
ValueCountFrequency (%)
4 117158
25.3%
8 95317
20.6%
56489
12.2%
2 40059
 
8.7%
3 29150
 
6.3%
6 27541
 
5.9%
) 25648
 
5.5%
( 25648
 
5.5%
: 25647
 
5.5%
7 13763
 
3.0%
Other values (7) 6580
 
1.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 958370
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
G 118860
12.4%
4 117158
12.2%
S 116960
12.2%
8 95317
9.9%
W 91309
 
9.5%
56489
 
5.9%
2 40059
 
4.2%
P 32649
 
3.4%
3 29150
 
3.0%
6 27541
 
2.9%
Other values (39) 232878
24.3%
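
The word counts above show the same datum written in several ways (WGS84, separate "wgs" and "84" tokens, an explicit EPSG:4326), alongside NAD27, NAD83, AGD66 and placeholders such as "not recorded". A minimal sketch of how these spellings could be collapsed to EPSG codes follows. The function name `normalize_datum` and the lookup table are hypothetical, and the combined spelling "WGS 84 (EPSG:4326)" is an assumption inferred from the token and parenthesis counts, not a value shown in the report.

```python
import re

# Hypothetical lookup table. The codes are the standard EPSG geographic CRS
# codes for these datums; the set of spellings handled is an assumption.
DATUM_TO_EPSG = {
    "WGS84": "EPSG:4326",
    "NAD27": "EPSG:4267",
    "NAD83": "EPSG:4269",
    "AGD66": "EPSG:4202",
}


def normalize_datum(raw):
    """Return an EPSG code for a free-text datum, or None if unrecognised."""
    if raw is None:
        return None
    text = raw.strip().upper()
    # An explicit EPSG code anywhere in the string wins.
    match = re.search(r"EPSG:\s*(\d+)", text)
    if match:
        return f"EPSG:{match.group(1)}"
    # Otherwise keep only letters and digits so "WGS 84" equals "WGS84".
    key = re.sub(r"[^A-Z0-9]", "", text)
    return DATUM_TO_EPSG.get(key)
```

Values that return None (for example "not recorded" or "PRP_M") are left for manual review rather than guessed.
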
Distinct6505
Distinct (%)9.4%
Missing3744590
Missing (%)98.2%
Memory size29.1 MiB

Length

Max length9
Median length7
Mean length5.591376656
Min length1

Characters and Unicode

Total characters388651
Distinct characters11
Distinct categories2
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique2288
Unique (%)3.3%

Sample

1st row401.569
2nd row3246
3rd row3429.51
4th row801.569
5th row4233
ValueCountFrequency (%)
3036 736
 
1.1%
100 596
 
0.9%
347.618 587
 
0.8%
500 567
 
0.8%
16000 557
 
0.8%
186.684 539
 
0.8%
1000 538
 
0.8%
4615 493
 
0.7%
1066 433
 
0.6%
5615 430
 
0.6%
Other values (6495) 64033
92.1%

Most occurring characters

ValueCountFrequency (%)
1 52237
13.4%
2 37995
9.8%
0 37635
9.7%
3 37119
9.6%
. 36994
9.5%
5 36439
9.4%
4 34540
8.9%
6 32987
8.5%
9 28494
7.3%
8 27891
7.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 351657
90.5%
Other Punctuation 36994
 
9.5%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 52237
14.9%
2 37995
10.8%
0 37635
10.7%
3 37119
10.6%
5 36439
10.4%
4 34540
9.8%
6 32987
9.4%
9 28494
8.1%
8 27891
7.9%
7 26320
7.5%
Other Punctuation
ValueCountFrequency (%)
. 36994
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 388651
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 52237
13.4%
2 37995
9.8%
0 37635
9.7%
3 37119
9.6%
. 36994
9.5%
5 36439
9.4%
4 34540
8.9%
6 32987
8.5%
9 28494
7.3%
8 27891
7.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 388651
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 52237
13.4%
2 37995
9.8%
0 37635
9.7%
3 37119
9.6%
. 36994
9.5%
5 36439
9.4%
4 34540
8.9%
6 32987
8.5%
9 28494
7.3%
8 27891
7.2%

coordinatePrecision
Text

Missing 

Distinct3
Distinct (%)100.0%
Missing3814096
Missing (%)> 99.9%
Memory size29.1 MiB

Length

Max length3
Median length3
Mean length2.666666667
Min length2

Characters and Unicode

Total characters8
Distinct characters6
Distinct categories1
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique3
Unique (%)100.0%

Sample

1st row10
2nd row153
3rd row239
ValueCountFrequency (%)
10 1
33.3%
153 1
33.3%
239 1
33.3%

Most occurring characters

ValueCountFrequency (%)
1 2
25.0%
3 2
25.0%
0 1
12.5%
5 1
12.5%
2 1
12.5%
9 1
12.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 8
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 2
25.0%
3 2
25.0%
0 1
12.5%
5 1
12.5%
2 1
12.5%
9 1
12.5%

Most occurring scripts

ValueCountFrequency (%)
Common 8
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 2
25.0%
3 2
25.0%
0 1
12.5%
5 1
12.5%
2 1
12.5%
9 1
12.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 8
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 2
25.0%
3 2
25.0%
0 1
12.5%
5 1
12.5%
2 1
12.5%
9 1
12.5%
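
The character statistics for this variable are small enough to recompute by hand. The sketch below tallies characters and Unicode general categories with Python's standard `unicodedata` module. It is not the profiler's own implementation, and the helper name `char_profile` is hypothetical. The standard library reports two-letter category codes (`Nd` for Decimal Number) and has no script lookup, so scripts are left out.

```python
import unicodedata
from collections import Counter


def char_profile(values):
    """Count characters and Unicode general categories over a list of strings."""
    chars = Counter(ch for value in values for ch in value)
    categories = Counter()
    for ch, count in chars.items():
        categories[unicodedata.category(ch)] += count
    return {
        "total_characters": sum(chars.values()),
        "distinct_characters": len(chars),
        "categories": dict(categories),
        "all_ascii": all(ord(ch) < 128 for ch in chars),
    }


# The three non-missing coordinatePrecision values listed in the sample.
profile = char_profile(["10", "153", "239"])
print(profile)
```

For the three sample values this gives 8 total characters, 6 distinct characters, and a single category `Nd` with count 8, which agrees with the figures reported above.
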

pointRadiusSpatialFit
Text

Missing 

Distinct4
Distinct (%)100.0%
Missing3814095
Missing (%)> 99.9%
Memory size29.1 MiB

Length

Max length15
Median length9
Mean length5.75
Min length2

Characters and Unicode

Total characters23
Distinct characters19
Distinct categories5
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique4
Unique (%)100.0%

Sample

1st row10
2nd row153
3rd row239
4th rowFluminicola sp.
ValueCountFrequency (%)
10 1
20.0%
153 1
20.0%
239 1
20.0%
fluminicola 1
20.0%
sp 1
20.0%

Most occurring characters

ValueCountFrequency (%)
1 2
 
8.7%
3 2
 
8.7%
l 2
 
8.7%
i 2
 
8.7%
n 1
 
4.3%
p 1
 
4.3%
s 1
 
4.3%
1
 
4.3%
a 1
 
4.3%
o 1
 
4.3%
Other values (9) 9
39.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 12
52.2%
Decimal Number 8
34.8%
Space Separator 1
 
4.3%
Uppercase Letter 1
 
4.3%
Other Punctuation 1
 
4.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
l 2
16.7%
i 2
16.7%
n 1
8.3%
p 1
8.3%
s 1
8.3%
a 1
8.3%
o 1
8.3%
c 1
8.3%
m 1
8.3%
u 1
8.3%
Decimal Number
ValueCountFrequency (%)
1 2
25.0%
3 2
25.0%
0 1
12.5%
9 1
12.5%
2 1
12.5%
5 1
12.5%
Space Separator
ValueCountFrequency (%)
1
100.0%
Uppercase Letter
ValueCountFrequency (%)
F 1
100.0%
Other Punctuation
ValueCountFrequency (%)
. 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 13
56.5%
Common 10
43.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
l 2
15.4%
i 2
15.4%
n 1
7.7%
p 1
7.7%
s 1
7.7%
a 1
7.7%
o 1
7.7%
c 1
7.7%
m 1
7.7%
u 1
7.7%
Common
ValueCountFrequency (%)
1 2
20.0%
3 2
20.0%
1
10.0%
0 1
10.0%
9 1
10.0%
2 1
10.0%
5 1
10.0%
. 1
10.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 23
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 2
 
8.7%
3 2
 
8.7%
l 2
 
8.7%
i 2
 
8.7%
n 1
 
4.3%
p 1
 
4.3%
s 1
 
4.3%
1
 
4.3%
a 1
 
4.3%
o 1
 
4.3%
Other values (9) 9
39.1%

verbatimCoordinates
Text

Missing 

Distinct6
Distinct (%)100.0%
Missing3814093
Missing (%)> 99.9%
Memory size29.1 MiB

Length

Max length4
Median length4
Mean length4
Min length4

Characters and Unicode

Total characters24
Distinct characters9
Distinct categories1
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique6
Unique (%)100.0%

Sample

1st row1929
2nd row2003
3rd row1955
4th row1911
5th row1907
ValueCountFrequency (%)
1929 1
16.7%
2003 1
16.7%
1955 1
16.7%
1911 1
16.7%
1907 1
16.7%
1876 1
16.7%

Most occurring characters

ValueCountFrequency (%)
1 7
29.2%
9 5
20.8%
0 3
12.5%
2 2
 
8.3%
5 2
 
8.3%
7 2
 
8.3%
3 1
 
4.2%
8 1
 
4.2%
6 1
 
4.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 24
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 7
29.2%
9 5
20.8%
0 3
12.5%
2 2
 
8.3%
5 2
 
8.3%
7 2
 
8.3%
3 1
 
4.2%
8 1
 
4.2%
6 1
 
4.2%

Most occurring scripts

ValueCountFrequency (%)
Common 24
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 7
29.2%
9 5
20.8%
0 3
12.5%
2 2
 
8.3%
5 2
 
8.3%
7 2
 
8.3%
3 1
 
4.2%
8 1
 
4.2%
6 1
 
4.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 24
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 7
29.2%
9 5
20.8%
0 3
12.5%
2 2
 
8.3%
5 2
 
8.3%
7 2
 
8.3%
3 1
 
4.2%
8 1
 
4.2%
6 1
 
4.2%

verbatimLatitude
Text

Missing 

Distinct44509
Distinct (%)13.9%
Missing3492892
Missing (%)91.6%
Memory size29.1 MiB

Length

Max length46331
Median length10
Mean length9.341891677
Min length1

Characters and Unicode

Total characters3000681
Distinct characters107
Distinct categories17
Distinct scripts3
Distinct blocks6
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique19728
Unique (%)6.1%

Sample

1st row38 56 10 N
2nd row44.883125
3rd row02 47 -- N
4th row37 58 10 N
5th row03 18.20' N
ValueCountFrequency (%)
n 202633
 
21.3%
60360
 
6.3%
s 37789
 
4.0%
35 24680
 
2.6%
38 19881
 
2.1%
39 18621
 
2.0%
37 17456
 
1.8%
36 15457
 
1.6%
10 11742
 
1.2%
00 11158
 
1.2%
Other values (27212) 532628
55.9%

Most occurring characters

ValueCountFrequency (%)
627624
20.9%
3 283224
9.4%
0 248562
 
8.3%
N 235749
 
7.9%
2 217698
 
7.3%
1 210460
 
7.0%
5 188524
 
6.3%
4 185672
 
6.2%
- 142824
 
4.8%
8 115599
 
3.9%
Other values (97) 544745
18.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1759263
58.6%
Space Separator 627624
 
20.9%
Uppercase Letter 293587
 
9.8%
Dash Punctuation 142826
 
4.8%
Other Punctuation 104838
 
3.5%
Lowercase Letter 46817
 
1.6%
Control 20038
 
0.7%
Other Symbol 5019
 
0.2%
Other Letter 237
 
< 0.1%
Connector Punctuation 82
 
< 0.1%
Other values (7) 350
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 6860
14.7%
a 4780
10.2%
d 4754
10.2%
t 3460
 
7.4%
i 3459
 
7.4%
g 3071
 
6.6%
o 2771
 
5.9%
n 2580
 
5.5%
r 2476
 
5.3%
c 2090
 
4.5%
Other values (24) 10516
22.5%
Uppercase Letter
ValueCountFrequency (%)
N 235749
80.3%
S 52717
 
18.0%
L 617
 
0.2%
M 576
 
0.2%
A 514
 
0.2%
P 453
 
0.2%
U 361
 
0.1%
D 341
 
0.1%
E 281
 
0.1%
C 278
 
0.1%
Other values (16) 1700
 
0.6%
Other Punctuation
ValueCountFrequency (%)
. 78293
74.7%
' 14754
 
14.1%
" 4985
 
4.8%
; 3405
 
3.2%
: 1208
 
1.2%
, 885
 
0.8%
/ 821
 
0.8%
177
 
0.2%
? 155
 
0.1%
* 73
 
0.1%
Other values (5) 82
 
0.1%
Decimal Number
ValueCountFrequency (%)
3 283224
16.1%
0 248562
14.1%
2 217698
12.4%
1 210460
12.0%
5 188524
10.7%
4 185672
10.6%
8 115599
6.6%
9 108329
 
6.2%
7 102597
 
5.8%
6 98598
 
5.6%
Open Punctuation
ValueCountFrequency (%)
( 51
65.4%
[ 26
33.3%
{ 1
 
1.3%
Close Punctuation
ValueCountFrequency (%)
) 50
64.9%
] 26
33.8%
} 1
 
1.3%
Dash Punctuation
ValueCountFrequency (%)
- 142824
> 99.9%
2
 
< 0.1%
Control
ValueCountFrequency (%)
19932
99.5%
106
 
0.5%
Other Symbol
ValueCountFrequency (%)
° 5018
> 99.9%
1
 
< 0.1%
Math Symbol
ValueCountFrequency (%)
= 67
93.1%
~ 5
 
6.9%
Modifier Symbol
ValueCountFrequency (%)
´ 35
74.5%
˚ 12
 
25.5%
Space Separator
ValueCountFrequency (%)
627624
100.0%
Other Letter
ValueCountFrequency (%)
º 237
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 82
100.0%
Final Punctuation
ValueCountFrequency (%)
50
100.0%
Modifier Letter
ValueCountFrequency (%)
ʹ 25
100.0%
Nonspacing Mark
ValueCountFrequency (%)
̊ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 2660039
88.6%
Latin 340641
 
11.4%
Inherited 1
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
N 235749
69.2%
S 52717
 
15.5%
e 6860
 
2.0%
a 4780
 
1.4%
d 4754
 
1.4%
t 3460
 
1.0%
i 3459
 
1.0%
g 3071
 
0.9%
o 2771
 
0.8%
n 2580
 
0.8%
Other values (51) 20440
 
6.0%
Common
ValueCountFrequency (%)
627624
23.6%
3 283224
10.6%
0 248562
 
9.3%
2 217698
 
8.2%
1 210460
 
7.9%
5 188524
 
7.1%
4 185672
 
7.0%
- 142824
 
5.4%
8 115599
 
4.3%
9 108329
 
4.1%
Other values (35) 331523
12.5%
Inherited
ValueCountFrequency (%)
̊ 1
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2995041
99.8%
None 5318
 
0.2%
Punctuation 283
 
< 0.1%
Modifier Letters 37
 
< 0.1%
Diacriticals 1
 
< 0.1%
Geometric Shapes 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
627624
21.0%
3 283224
9.5%
0 248562
 
8.3%
N 235749
 
7.9%
2 217698
 
7.3%
1 210460
 
7.0%
5 188524
 
6.3%
4 185672
 
6.2%
- 142824
 
4.8%
8 115599
 
3.9%
Other values (78) 539105
18.0%
None
ValueCountFrequency (%)
° 5018
94.4%
º 237
 
4.5%
´ 35
 
0.7%
á 6
 
0.1%
é 6
 
0.1%
ô 4
 
0.1%
í 4
 
0.1%
ó 3
 
0.1%
ü 2
 
< 0.1%
ç 2
 
< 0.1%
Punctuation
ValueCountFrequency (%)
177
62.5%
54
 
19.1%
50
 
17.7%
2
 
0.7%
Modifier Letters
ValueCountFrequency (%)
ʹ 25
67.6%
˚ 12
32.4%
Diacriticals
ValueCountFrequency (%)
̊ 1
100.0%
Geometric Shapes
ValueCountFrequency (%)
1
100.0%
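
The sample rows for this variable mix several notations: space-separated degrees, minutes and seconds with a hemisphere letter ("38 56 10 N"), decimal minutes ("03 18.20' N"), placeholder seconds ("02 47 -- N"), and plain decimal degrees ("44.883125"). The sketch below shows one way such strings might be converted to signed decimal degrees. The function name `parse_verbatim_coordinate` and its `max_abs` parameter are hypothetical, and the parser only covers the notations visible in the samples. The many other symbols seen in the character tables (°, º, ´, and so on) would need further handling.

```python
import re

_NUMBER = re.compile(r"\d+(?:\.\d+)?")
_HEMISPHERE = re.compile(r"([NSEW])\s*$")


def parse_verbatim_coordinate(text, max_abs=180.0):
    """Best-effort conversion to signed decimal degrees; None if unparseable.

    Pass max_abs=90.0 for latitudes so that out-of-range values are rejected.
    """
    text = text.strip()
    value = None
    try:
        # Plain signed decimal degrees, e.g. "44.883125" or "-68.671977".
        value = float(text)
    except ValueError:
        hemi = _HEMISPHERE.search(text.upper())
        parts = [float(p) for p in _NUMBER.findall(text)]
        if hemi is None or not 1 <= len(parts) <= 3:
            return None
        # Missing minutes or seconds (e.g. "--") are treated as zero.
        degrees, minutes, seconds = (parts + [0.0, 0.0])[:3]
        if minutes >= 60 or seconds >= 60:
            return None
        value = degrees + minutes / 60 + seconds / 3600
        if hemi.group(1) in ("S", "W"):
            value = -value
    # Written this way so that NaN is rejected as well.
    if not abs(value) <= max_abs:
        return None
    return value
```

Treating "--" as zero seconds is a design choice made for this sketch; a stricter pipeline could instead record the reduced precision separately.
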

verbatimLongitude
Text

Missing 

Distinct46810
Distinct (%)14.6%
Missing3493424
Missing (%)91.6%
Memory size29.1 MiB

Length

Max length68
Median length11
Mean length9.96988228
Min length1

Characters and Unicode

Total characters3197092
Distinct characters67
Distinct categories16
Distinct scripts3
Distinct blocks6
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique21397
Unique (%)6.7%

Sample

1st row079 41 27 W
2nd row-68.671977
3rd row016 25 -- E
4th row076 55 55 W
5th row59 39.00' W
ValueCountFrequency (%)
w 186725
 
19.8%
60672
 
6.4%
e 53045
 
5.6%
083 13556
 
1.4%
30 9655
 
1.0%
00 9463
 
1.0%
077 9344
 
1.0%
080 8478
 
0.9%
081 8350
 
0.9%
076 7865
 
0.8%
Other values (26677) 577299
61.1%

Most occurring characters

ValueCountFrequency (%)
623777
19.5%
0 408474
12.8%
1 243902
 
7.6%
W 213465
 
6.7%
3 189432
 
5.9%
2 188925
 
5.9%
5 186067
 
5.8%
7 185084
 
5.8%
8 180278
 
5.6%
4 174743
 
5.5%
Other values (57) 602945
18.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1995207
62.4%
Space Separator 623777
 
19.5%
Uppercase Letter 287430
 
9.0%
Dash Punctuation 171257
 
5.4%
Other Punctuation 102184
 
3.2%
Lowercase Letter 11674
 
0.4%
Other Symbol 5009
 
0.2%
Other Letter 232
 
< 0.1%
Connector Punctuation 82
 
< 0.1%
Close Punctuation 56
 
< 0.1%
Other values (6) 184
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
d 2999
25.7%
e 2981
25.5%
g 2925
25.1%
n 552
 
4.7%
o 520
 
4.5%
t 402
 
3.4%
i 392
 
3.4%
u 380
 
3.3%
r 293
 
2.5%
s 130
 
1.1%
Other values (7) 100
 
0.9%
Other Punctuation
ValueCountFrequency (%)
. 78769
77.1%
' 14571
 
14.3%
" 4974
 
4.9%
; 3378
 
3.3%
177
 
0.2%
? 92
 
0.1%
* 73
 
0.1%
54
 
0.1%
: 43
 
< 0.1%
, 36
 
< 0.1%
Other values (2) 17
 
< 0.1%
Uppercase Letter
ValueCountFrequency (%)
W 213465
74.3%
E 72880
 
25.4%
L 522
 
0.2%
D 164
 
0.1%
S 115
 
< 0.1%
N 111
 
< 0.1%
G 82
 
< 0.1%
O 61
 
< 0.1%
M 27
 
< 0.1%
T 2
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
0 408474
20.5%
1 243902
12.2%
3 189432
9.5%
2 188925
9.5%
5 186067
9.3%
7 185084
9.3%
8 180278
9.0%
4 174743
8.8%
6 131878
 
6.6%
9 106424
 
5.3%
Dash Punctuation
ValueCountFrequency (%)
- 171255
> 99.9%
2
 
< 0.1%
Other Symbol
ValueCountFrequency (%)
° 5008
> 99.9%
1
 
< 0.1%
Modifier Symbol
ValueCountFrequency (%)
´ 35
74.5%
˚ 12
 
25.5%
Close Punctuation
ValueCountFrequency (%)
) 35
62.5%
] 21
37.5%
Open Punctuation
ValueCountFrequency (%)
( 33
62.3%
[ 20
37.7%
Space Separator
ValueCountFrequency (%)
623777
100.0%
Other Letter
ValueCountFrequency (%)
º 232
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 82
100.0%
Final Punctuation
ValueCountFrequency (%)
55
100.0%
Modifier Letter
ValueCountFrequency (%)
ʹ 25
100.0%
Math Symbol
ValueCountFrequency (%)
~ 3
100.0%
Nonspacing Mark
ValueCountFrequency (%)
̊ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 2897755
90.6%
Latin 299336
 
9.4%
Inherited 1
 
< 0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
623777
21.5%
0 408474
14.1%
1 243902
 
8.4%
3 189432
 
6.5%
2 188925
 
6.5%
5 186067
 
6.4%
7 185084
 
6.4%
8 180278
 
6.2%
4 174743
 
6.0%
- 171255
 
5.9%
Other values (27) 345818
11.9%
Latin
ValueCountFrequency (%)
W 213465
71.3%
E 72880
 
24.3%
d 2999
 
1.0%
e 2981
 
1.0%
g 2925
 
1.0%
n 552
 
0.2%
L 522
 
0.2%
o 520
 
0.2%
t 402
 
0.1%
i 392
 
0.1%
Other values (19) 1698
 
0.6%
Inherited
ValueCountFrequency (%)
̊ 1
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3191490
99.8%
None 5275
 
0.2%
Punctuation 288
 
< 0.1%
Modifier Letters 37
 
< 0.1%
Diacriticals 1
 
< 0.1%
Geometric Shapes 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
623777
19.5%
0 408474
12.8%
1 243902
 
7.6%
W 213465
 
6.7%
3 189432
 
5.9%
2 188925
 
5.9%
5 186067
 
5.8%
7 185084
 
5.8%
8 180278
 
5.6%
4 174743
 
5.5%
Other values (46) 597343
18.7%
None
ValueCountFrequency (%)
° 5008
94.9%
º 232
 
4.4%
´ 35
 
0.7%
Punctuation
ValueCountFrequency (%)
177
61.5%
55
 
19.1%
54
 
18.8%
2
 
0.7%
Modifier Letters
ValueCountFrequency (%)
ʹ 25
67.6%
˚ 12
32.4%
Diacriticals
ValueCountFrequency (%)
̊ 1
100.0%
Geometric Shapes
ValueCountFrequency (%)
1
100.0%
Distinct10
Distinct (%)< 0.1%
Missing3396655
Missing (%)89.1%
Memory size29.1 MiB

Length

Max length23
Median length23
Mean length22.71909526
Min length3

Characters and Unicode

Total characters9483950
Distinct characters36
Distinct categories6
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique2
Unique (%)< 0.1%

Sample

1st rowDegrees Minutes Seconds
2nd rowDegrees Minutes Seconds
3rd rowDegrees Minutes Seconds
4th rowDegrees Minutes Seconds
5th rowDegrees Minutes Seconds
ValueCountFrequency (%)
degrees 413797
33.4%
minutes 403867
32.6%
seconds 403867
32.6%
decimal 9930
 
0.8%
township 2873
 
0.2%
range 2873
 
0.2%
utm 296
 
< 0.1%
marsden 232
 
< 0.1%
square 232
 
< 0.1%
unknown 232
 
< 0.1%
Other values (6) 20
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
e 2062392
21.7%
s 1224636
12.9%
820775
 
8.7%
n 814408
 
8.6%
g 416670
 
4.4%
i 416670
 
4.4%
r 414261
 
4.4%
d 414016
 
4.4%
D 413822
 
4.4%
c 413797
 
4.4%
Other values (26) 2072503
21.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 7434230
78.4%
Uppercase Letter 1228922
 
13.0%
Space Separator 820775
 
8.7%
Decimal Number 21
 
< 0.1%
Other Punctuation 1
 
< 0.1%
Dash Punctuation 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 2062392
27.7%
s 1224636
16.5%
n 814408
 
11.0%
g 416670
 
5.6%
i 416670
 
5.6%
r 414261
 
5.6%
d 414016
 
5.6%
c 413797
 
5.6%
o 406972
 
5.5%
u 404099
 
5.4%
Other values (9) 446309
 
6.0%
Uppercase Letter
ValueCountFrequency (%)
D 413822
33.7%
M 404395
32.9%
S 404099
32.9%
T 3169
 
0.3%
R 2873
 
0.2%
U 540
 
< 0.1%
A 12
 
< 0.1%
Q 12
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
0 15
71.4%
6 2
 
9.5%
2 1
 
4.8%
1 1
 
4.8%
8 1
 
4.8%
7 1
 
4.8%
Space Separator
ValueCountFrequency (%)
820775
100.0%
Other Punctuation
ValueCountFrequency (%)
. 1
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 8663152
91.3%
Common 820798
 
8.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 2062392
23.8%
s 1224636
14.1%
n 814408
 
9.4%
g 416670
 
4.8%
i 416670
 
4.8%
r 414261
 
4.8%
d 414016
 
4.8%
D 413822
 
4.8%
c 413797
 
4.8%
o 406972
 
4.7%
Other values (17) 1665508
19.2%
Common
ValueCountFrequency (%)
820775
> 99.9%
0 15
 
< 0.1%
6 2
 
< 0.1%
2 1
 
< 0.1%
. 1
 
< 0.1%
1 1
 
< 0.1%
8 1
 
< 0.1%
7 1
 
< 0.1%
- 1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 9483950
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 2062392
21.7%
s 1224636
12.9%
820775
 
8.7%
n 814408
 
8.6%
g 416670
 
4.4%
i 416670
 
4.4%
r 414261
 
4.4%
d 414016
 
4.4%
D 413822
 
4.4%
c 413797
 
4.4%
Other values (26) 2072503
21.9%

verbatimSRS
Text

Missing 

Distinct2
Distinct (%)100.0%
Missing3814097
Missing (%)> 99.9%
Memory size29.1 MiB

Length

Max length10
Median length8
Mean length8
Min length6

Characters and Unicode

Total characters16
Distinct characters9
Distinct categories3
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique2
Unique (%)100.0%

Sample

1st row2700.0
2nd row1889-03-29
ValueCountFrequency (%)
2700.0 1
50.0%
1889-03-29 1
50.0%

Most occurring characters

ValueCountFrequency (%)
0 4
25.0%
2 2
12.5%
8 2
12.5%
9 2
12.5%
- 2
12.5%
7 1
 
6.2%
. 1
 
6.2%
1 1
 
6.2%
3 1
 
6.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 13
81.2%
Dash Punctuation 2
 
12.5%
Other Punctuation 1
 
6.2%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 4
30.8%
2 2
15.4%
8 2
15.4%
9 2
15.4%
7 1
 
7.7%
1 1
 
7.7%
3 1
 
7.7%
Dash Punctuation
ValueCountFrequency (%)
- 2
100.0%
Other Punctuation
ValueCountFrequency (%)
. 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 16
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 4
25.0%
2 2
12.5%
8 2
12.5%
9 2
12.5%
- 2
12.5%
7 1
 
6.2%
. 1
 
6.2%
1 1
 
6.2%
3 1
 
6.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 16
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 4
25.0%
2 2
12.5%
8 2
12.5%
9 2
12.5%
- 2
12.5%
7 1
 
6.2%
. 1
 
6.2%
1 1
 
6.2%
3 1
 
6.2%
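
The only two populated verbatimSRS values, "2700.0" and "1889-03-29", are a number and an ISO date rather than a spatial reference system. This suggests the values may have been shifted in from neighbouring columns, although the report itself does not confirm the cause. A minimal heuristic for flagging such entries is sketched below. The function name `looks_misplaced_in_srs` and both patterns are hypothetical; a real check would more likely validate against a list of known SRS identifiers.

```python
import re

_ISO_DATE = re.compile(r"^\d{4}-\d{2}-\d{2}$")
_NUMERIC = re.compile(r"^-?\d+(?:\.\d+)?$")


def looks_misplaced_in_srs(value):
    """Return True when a value is a bare number or ISO date, not an SRS name."""
    value = value.strip()
    return bool(_ISO_DATE.match(value) or _NUMERIC.match(value))


# Both populated values from the sample are flagged; typical SRS spellings are not.
flags = {v: looks_misplaced_in_srs(v) for v in ["2700.0", "1889-03-29", "EPSG:4326", "WGS84"]}
print(flags)
```
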

footprintSRS
Text

Missing 

Distinct2
Distinct (%)100.0%
Missing3814097
Missing (%)> 99.9%
Memory size29.1 MiB

Length

Max length43
Median length22.5
Mean length22.5
Min length2

Characters and Unicode

Total characters45
Distinct characters23
Distinct categories5
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique2
Unique (%)100.0%

Sample

1st row88
2nd rowAnimalia, Mollusca, Gastropoda, Hydrobiidae
ValueCountFrequency (%)
88 1
20.0%
animalia 1
20.0%
mollusca 1
20.0%
gastropoda 1
20.0%
hydrobiidae 1
20.0%

Most occurring characters

ValueCountFrequency (%)
a 6
13.3%
i 4
 
8.9%
o 4
 
8.9%
l 3
 
6.7%
, 3
 
6.7%
3
 
6.7%
d 3
 
6.7%
8 2
 
4.4%
s 2
 
4.4%
r 2
 
4.4%
Other values (13) 13
28.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 33
73.3%
Uppercase Letter 4
 
8.9%
Other Punctuation 3
 
6.7%
Space Separator 3
 
6.7%
Decimal Number 2
 
4.4%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 6
18.2%
i 4
12.1%
o 4
12.1%
l 3
9.1%
d 3
9.1%
s 2
 
6.1%
r 2
 
6.1%
t 1
 
3.0%
b 1
 
3.0%
y 1
 
3.0%
Other values (6) 6
18.2%
Uppercase Letter
ValueCountFrequency (%)
H 1
25.0%
G 1
25.0%
A 1
25.0%
M 1
25.0%
Other Punctuation
ValueCountFrequency (%)
, 3
100.0%
Space Separator
ValueCountFrequency (%)
3
100.0%
Decimal Number
ValueCountFrequency (%)
8 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 37
82.2%
Common 8
 
17.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 6
16.2%
i 4
 
10.8%
o 4
 
10.8%
l 3
 
8.1%
d 3
 
8.1%
s 2
 
5.4%
r 2
 
5.4%
t 1
 
2.7%
b 1
 
2.7%
y 1
 
2.7%
Other values (10) 10
27.0%
Common
ValueCountFrequency (%)
, 3
37.5%
3
37.5%
8 2
25.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 45
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 6
13.3%
i 4
 
8.9%
o 4
 
8.9%
l 3
 
6.7%
, 3
 
6.7%
3
 
6.7%
d 3
 
6.7%
8 2
 
4.4%
s 2
 
4.4%
r 2
 
4.4%
Other values (13) 13
28.9%

footprintSpatialFit
Text

Missing 

Distinct8
Distinct (%)100.0%
Missing3814091
Missing (%)> 99.9%
Memory size29.1 MiB

Length

Max length17
Median length16
Mean length13
Min length2

Characters and Unicode

Total characters104
Distinct characters27
Distinct categories5
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique8
Unique (%)100.0%

Sample

1st rowDrosera sp.
2nd row88
3rd rowMiconia coronata
4th rowBoerhavia diffusa
5th rowAnimalia
ValueCountFrequency (%)
drosera 1
 
7.1%
sp 1
 
7.1%
88 1
 
7.1%
miconia 1
 
7.1%
coronata 1
 
7.1%
boerhavia 1
 
7.1%
diffusa 1
 
7.1%
animalia 1
 
7.1%
myrcia 1
 
7.1%
splendens 1
 
7.1%
Other values (4) 4
28.6%

Most occurring characters

ValueCountFrequency (%)
a 14
13.5%
i 10
 
9.6%
s 10
 
9.6%
r 8
 
7.7%
n 7
 
6.7%
e 6
 
5.8%
6
 
5.8%
o 5
 
4.8%
t 4
 
3.8%
c 4
 
3.8%
Other values (17) 30
28.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 88
84.6%
Uppercase Letter 7
 
6.7%
Space Separator 6
 
5.8%
Decimal Number 2
 
1.9%
Other Punctuation 1
 
1.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 14
15.9%
i 10
11.4%
s 10
11.4%
r 8
9.1%
n 7
8.0%
e 6
 
6.8%
o 5
 
5.7%
t 4
 
4.5%
c 4
 
4.5%
u 4
 
4.5%
Other values (9) 16
18.2%
Uppercase Letter
ValueCountFrequency (%)
B 2
28.6%
M 2
28.6%
D 1
14.3%
A 1
14.3%
C 1
14.3%
Space Separator
ValueCountFrequency (%)
6
100.0%
Decimal Number
ValueCountFrequency (%)
8 2
100.0%
Other Punctuation
ValueCountFrequency (%)
. 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 95
91.3%
Common 9
 
8.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 14
14.7%
i 10
10.5%
s 10
10.5%
r 8
 
8.4%
n 7
 
7.4%
e 6
 
6.3%
o 5
 
5.3%
t 4
 
4.2%
c 4
 
4.2%
u 4
 
4.2%
Other values (14) 23
24.2%
Common
ValueCountFrequency (%)
6
66.7%
8 2
 
22.2%
. 1
 
11.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 104
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 14
13.5%
i 10
 
9.6%
s 10
 
9.6%
r 8
 
7.7%
n 7
 
6.7%
e 6
 
5.8%
6
 
5.8%
o 5
 
4.8%
t 4
 
3.8%
c 4
 
3.8%
Other values (17) 30
28.8%

georeferencedBy
Text

Missing 

Distinct2
Distinct (%)100.0%
Missing3814097
Missing (%)> 99.9%
Memory size29.1 MiB

Length

Max length8
Median length6
Mean length6
Min length4

Characters and Unicode

Total characters12
Distinct characters10
Distinct categories3
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique2
Unique (%)100.0%

Sample

1st row1889
2nd rowMollusca
ValueCountFrequency (%)
1889 1
50.0%
mollusca 1
50.0%

Most occurring characters

ValueCountFrequency (%)
8 2
16.7%
l 2
16.7%
1 1
8.3%
9 1
8.3%
M 1
8.3%
o 1
8.3%
u 1
8.3%
s 1
8.3%
c 1
8.3%
a 1
8.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 7
58.3%
Decimal Number 4
33.3%
Uppercase Letter 1
 
8.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
l 2
28.6%
o 1
14.3%
u 1
14.3%
s 1
14.3%
c 1
14.3%
a 1
14.3%
Decimal Number
ValueCountFrequency (%)
8 2
50.0%
1 1
25.0%
9 1
25.0%
Uppercase Letter
ValueCountFrequency (%)
M 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 8
66.7%
Common 4
33.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
l 2
25.0%
M 1
12.5%
o 1
12.5%
u 1
12.5%
s 1
12.5%
c 1
12.5%
a 1
12.5%
Common
ValueCountFrequency (%)
8 2
50.0%
1 1
25.0%
9 1
25.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 12
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
8 2
16.7%
l 2
16.7%
1 1
8.3%
9 1
8.3%
M 1
8.3%
o 1
8.3%
u 1
8.3%
s 1
8.3%
c 1
8.3%
a 1
8.3%

georeferencedDate
Text

Missing 

Distinct2
Distinct (%)100.0%
Missing3814097
Missing (%)> 99.9%
Memory size29.1 MiB

Length

Max length10
Median length5.5
Mean length5.5
Min length1

Characters and Unicode

Total characters11
Distinct characters9
Distinct categories3
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique2
Unique (%)100.0%

Sample

1st row3
2nd rowGastropoda
ValueCountFrequency (%)
3 1
50.0%
gastropoda 1
50.0%

Most occurring characters

ValueCountFrequency (%)
a 2
18.2%
o 2
18.2%
3 1
9.1%
G 1
9.1%
s 1
9.1%
t 1
9.1%
r 1
9.1%
p 1
9.1%
d 1
9.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 9
81.8%
Decimal Number 1
 
9.1%
Uppercase Letter 1
 
9.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 2
22.2%
o 2
22.2%
s 1
11.1%
t 1
11.1%
r 1
11.1%
p 1
11.1%
d 1
11.1%
Decimal Number
ValueCountFrequency (%)
3 1
100.0%
Uppercase Letter
ValueCountFrequency (%)
G 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 10
90.9%
Common 1
 
9.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 2
20.0%
o 2
20.0%
G 1
10.0%
s 1
10.0%
t 1
10.0%
r 1
10.0%
p 1
10.0%
d 1
10.0%
Common
ValueCountFrequency (%)
3 1
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 11
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 2
18.2%
o 2
18.2%
3 1
9.1%
G 1
9.1%
s 1
9.1%
t 1
9.1%
r 1
9.1%
p 1
9.1%
d 1
9.1%

georeferenceProtocol
Text

Missing 

Distinct: 2782
Distinct (%): 0.6%
Missing: 3320409
Missing (%): 87.1%
Memory size: 29.1 MiB

Length

Max length: 302
Median length: 300
Mean length: 25.53193705
Min length: 2

Characters and Unicode

Total characters: 12604862
Distinct characters: 82
Distinct categories: 10
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 851
Unique (%): 0.2%

Sample

1st row: unknown, from legacy
2nd row: GEOLocate
3rd row: ArcGIS software with data from New Mexico Resource Geographic Information System Program (http://rgis.unm.edu) and other inhouse resources (historical maps aiding with name changes), MaNIS/HerpNET/ORNIS Georeferencing Guidelines
4th row: Google Earth
5th row: Alexandria Digital Library Gazetteer, MaNIS/HerpNET/ORNIS Georeferencing Guidelines
ValueCountFrequency (%)
from 211518
 
13.0%
unknown 208911
 
12.8%
legacy 208034
 
12.7%
google 88871
 
5.4%
earth 64975
 
4.0%
geolocate 58622
 
3.6%
georeferencing 56351
 
3.5%
manis/herpnet/ornis 55304
 
3.4%
guidelines 55299
 
3.4%
gazetteer 32635
 
2.0%
Other values (3214) 592554
36.3%

Most occurring characters

ValueCountFrequency (%)
1139384
 
9.0%
e 1081446
 
8.6%
o 916290
 
7.3%
n 905258
 
7.2%
a 730763
 
5.8%
r 662416
 
5.3%
l 440906
 
3.5%
g 426015
 
3.4%
G 400771
 
3.2%
c 398496
 
3.2%
Other values (72) 5503117
43.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 8535644
67.7%
Uppercase Letter 1879803
 
14.9%
Space Separator 1139384
 
9.0%
Other Punctuation 544960
 
4.3%
Decimal Number 396316
 
3.1%
Open Punctuation 39838
 
0.3%
Close Punctuation 39754
 
0.3%
Dash Punctuation 28917
 
0.2%
Math Symbol 138
 
< 0.1%
Connector Punctuation 108
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 1081446
12.7%
o 916290
 
10.7%
n 905258
 
10.6%
a 730763
 
8.6%
r 662416
 
7.8%
l 440906
 
5.2%
g 426015
 
5.0%
c 398496
 
4.7%
u 341516
 
4.0%
i 334659
 
3.9%
Other values (17) 2297879
26.9%
Uppercase Letter
ValueCountFrequency (%)
G 400771
21.3%
S 220477
11.7%
N 208397
11.1%
E 184537
9.8%
I 137133
 
7.3%
O 120049
 
6.4%
M 110761
 
5.9%
T 106408
 
5.7%
L 77973
 
4.1%
R 62436
 
3.3%
Other values (17) 250861
13.3%
Other Punctuation
ValueCountFrequency (%)
, 325185
59.7%
/ 122278
 
22.4%
: 43075
 
7.9%
. 40103
 
7.4%
; 6297
 
1.2%
! 3660
 
0.7%
# 2675
 
0.5%
' 1093
 
0.2%
& 548
 
0.1%
? 40
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
0 180526
45.6%
2 62657
 
15.8%
1 56273
 
14.2%
4 32987
 
8.3%
5 16514
 
4.2%
9 11759
 
3.0%
7 10473
 
2.6%
6 10339
 
2.6%
3 9170
 
2.3%
8 5618
 
1.4%
Math Symbol
ValueCountFrequency (%)
+ 136
98.6%
= 2
 
1.4%
Space Separator
ValueCountFrequency (%)
1139384
100.0%
Open Punctuation
ValueCountFrequency (%)
( 39838
100.0%
Close Punctuation
ValueCountFrequency (%)
) 39754
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 28917
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 108
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 10415447
82.6%
Common 2189415
 
17.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 1081446
 
10.4%
o 916290
 
8.8%
n 905258
 
8.7%
a 730763
 
7.0%
r 662416
 
6.4%
l 440906
 
4.2%
g 426015
 
4.1%
G 400771
 
3.8%
c 398496
 
3.8%
u 341516
 
3.3%
Other values (44) 4111570
39.5%
Common
ValueCountFrequency (%)
1139384
52.0%
, 325185
 
14.9%
0 180526
 
8.2%
/ 122278
 
5.6%
2 62657
 
2.9%
1 56273
 
2.6%
: 43075
 
2.0%
. 40103
 
1.8%
( 39838
 
1.8%
) 39754
 
1.8%
Other values (18) 140342
 
6.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 12603228
> 99.9%
None 1634
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1139384
 
9.0%
e 1081446
 
8.6%
o 916290
 
7.3%
n 905258
 
7.2%
a 730763
 
5.8%
r 662416
 
5.3%
l 440906
 
3.5%
g 426015
 
3.4%
G 400771
 
3.2%
c 398496
 
3.2%
Other values (70) 5501483
43.7%
None
ValueCountFrequency (%)
í 1633
99.9%
Î 1
 
0.1%
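The summary figures for georeferenceProtocol are mutually consistent, which can be checked with a few lines of arithmetic. The sketch below assumes that Missing (%) is taken over all observations, while Distinct (%), Unique (%), and Mean length are taken over non-missing observations only; that reading is inferred from the numbers, not stated in the report. The helper name `pct` is hypothetical.

```python
# Counts copied from the report.
TOTAL_ROWS = 3_814_099      # "Number of observations" in the overview
MISSING = 3_320_409         # georeferenceProtocol: Missing
DISTINCT = 2_782            # georeferenceProtocol: Distinct
UNIQUE = 851                # georeferenceProtocol: Unique
TOTAL_CHARS = 12_604_862    # georeferenceProtocol: Total characters

# Rows that actually carry a value.
present = TOTAL_ROWS - MISSING


def pct(part, whole):
    """Percentage of `part` in `whole`, rounded to one decimal place."""
    return round(100 * part / whole, 1)


missing_pct = pct(MISSING, TOTAL_ROWS)   # share of all rows
distinct_pct = pct(DISTINCT, present)    # share of non-missing rows (assumed)
unique_pct = pct(UNIQUE, present)        # share of non-missing rows (assumed)
mean_length = TOTAL_CHARS / present      # characters per non-missing value

print(present, missing_pct, distinct_pct, unique_pct, round(mean_length, 8))
```

Under these assumptions the computed values match the reported 87.1%, 0.6%, 0.2%, and a mean length of about 25.53.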

georeferenceSources
Text

Constant  Missing 

Distinct: 1
Distinct (%): 100.0%
Missing: 3814098
Missing (%): > 99.9%
Memory size: 29.1 MiB

Length

Max length: 11
Median length: 11
Mean length: 11
Min length: 11

Characters and Unicode

Total characters: 11
Distinct characters: 8
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st row: 29 Mar 1889
ValueCountFrequency (%)
29 1
33.3%
mar 1
33.3%
1889 1
33.3%

Most occurring characters

ValueCountFrequency (%)
9 2
18.2%
2
18.2%
8 2
18.2%
2 1
9.1%
M 1
9.1%
a 1
9.1%
r 1
9.1%
1 1
9.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 6
54.5%
Space Separator 2
 
18.2%
Lowercase Letter 2
 
18.2%
Uppercase Letter 1
 
9.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
9 2
33.3%
8 2
33.3%
2 1
16.7%
1 1
16.7%
Lowercase Letter
ValueCountFrequency (%)
a 1
50.0%
r 1
50.0%
Space Separator
ValueCountFrequency (%)
2
100.0%
Uppercase Letter
ValueCountFrequency (%)
M 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 8
72.7%
Latin 3
 
27.3%

Most frequent character per script

Common
ValueCountFrequency (%)
9 2
25.0%
2
25.0%
8 2
25.0%
2 1
12.5%
1 1
12.5%
Latin
ValueCountFrequency (%)
M 1
33.3%
a 1
33.3%
r 1
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 11
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
9 2
18.2%
2
18.2%
8 2
18.2%
2 1
9.1%
M 1
9.1%
a 1
9.1%
r 1
9.1%
1 1
9.1%
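The word and character frequency tables for georeferenceSources follow directly from its single value, "29 Mar 1889". The sketch below assumes a simple tokenisation (lowercase, then split on whitespace), which matches the rows shown here but may differ from what the profiling tool does for values containing punctuation. The helper name `shares` is hypothetical.

```python
from collections import Counter

# The single non-missing value in georeferenceSources.
value = "29 Mar 1889"

# Assumed tokenisation: lowercase the text, then split on whitespace.
words = Counter(value.lower().split())

# Character counts include the two space characters.
chars = Counter(value)


def shares(counter):
    """Convert raw counts to percentages rounded to one decimal place."""
    total = sum(counter.values())
    return {key: round(100 * n / total, 1) for key, n in counter.items()}


word_shares = shares(words)
char_shares = shares(chars)

print(word_shares)
print(char_shares)
```

Each of the three words accounts for a third of the tokens, and the characters that occur twice (9, 8, and the space) each account for 2 of 11 characters.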

georeferenceRemarks
Text

Missing 

Distinct: 6364
Distinct (%): 7.6%
Missing: 3730205
Missing (%): 97.8%
Memory size: 29.1 MiB

Length

Max length: 182
Median length: 126
Mean length: 21.79519394
Min length: 1

Characters and Unicode

Total characters: 1828486
Distinct characters: 83
Distinct categories: 11
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3216
Unique (%): 3.8%

Sample

1st row: Locality extent = 400 m
2nd row: Locality extent = 0.6
3rd row: Locality extent = 1.059 mi.
4th row: Locality extent = 800 m
5th row: Coordinate Uncertainty In Meters: 44967
ValueCountFrequency (%)
locality 55561
16.6%
55391
16.6%
extent 55332
16.5%
mi 16479
 
4.9%
ca 7757
 
2.3%
km 4763
 
1.4%
approximate 4046
 
1.2%
in 3776
 
1.1%
coordinate 3445
 
1.0%
meters 3433
 
1.0%
Other values (6286) 124350
37.2%

Most occurring characters

ValueCountFrequency (%)
250439
 
13.7%
t 206872
 
11.3%
e 156329
 
8.5%
a 99262
 
5.4%
i 95332
 
5.2%
o 87887
 
4.8%
n 86284
 
4.7%
l 68326
 
3.7%
c 64373
 
3.5%
. 63081
 
3.4%
Other values (73) 650301
35.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1145629
62.7%
Space Separator 250439
 
13.7%
Decimal Number 175491
 
9.6%
Uppercase Letter 127324
 
7.0%
Other Punctuation 72826
 
4.0%
Math Symbol 55339
 
3.0%
Dash Punctuation 866
 
< 0.1%
Open Punctuation 285
 
< 0.1%
Close Punctuation 285
 
< 0.1%
Initial Punctuation 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t 206872
18.1%
e 156329
13.6%
a 99262
8.7%
i 95332
8.3%
o 87887
7.7%
n 86284
7.5%
l 68326
 
6.0%
c 64373
 
5.6%
y 61729
 
5.4%
x 60958
 
5.3%
Other values (17) 158277
13.8%
Uppercase Letter
ValueCountFrequency (%)
L 55887
43.9%
C 14147
 
11.1%
A 8746
 
6.9%
M 4777
 
3.8%
I 4295
 
3.4%
G 4023
 
3.2%
P 3673
 
2.9%
U 3576
 
2.8%
D 3420
 
2.7%
S 3225
 
2.5%
Other values (16) 21555
 
16.9%
Decimal Number
ValueCountFrequency (%)
0 33298
19.0%
1 29145
16.6%
5 23499
13.4%
2 21435
12.2%
3 16974
9.7%
6 13385
7.6%
4 11027
 
6.3%
7 10595
 
6.0%
8 9404
 
5.4%
9 6729
 
3.8%
Other Punctuation
ValueCountFrequency (%)
. 63081
86.6%
: 3636
 
5.0%
, 2559
 
3.5%
; 2554
 
3.5%
/ 805
 
1.1%
' 154
 
0.2%
" 20
 
< 0.1%
& 9
 
< 0.1%
# 8
 
< 0.1%
Math Symbol
ValueCountFrequency (%)
= 55316
> 99.9%
+ 22
 
< 0.1%
± 1
 
< 0.1%
Open Punctuation
ValueCountFrequency (%)
( 276
96.8%
[ 9
 
3.2%
Close Punctuation
ValueCountFrequency (%)
) 276
96.8%
] 9
 
3.2%
Space Separator
ValueCountFrequency (%)
250439
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 866
100.0%
Initial Punctuation
ValueCountFrequency (%)
1
100.0%
Final Punctuation
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1272953
69.6%
Common 555533
30.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
t 206872
16.3%
e 156329
12.3%
a 99262
 
7.8%
i 95332
 
7.5%
o 87887
 
6.9%
n 86284
 
6.8%
l 68326
 
5.4%
c 64373
 
5.1%
y 61729
 
4.8%
x 60958
 
4.8%
Other values (43) 285601
22.4%
Common
ValueCountFrequency (%)
250439
45.1%
. 63081
 
11.4%
= 55316
 
10.0%
0 33298
 
6.0%
1 29145
 
5.2%
5 23499
 
4.2%
2 21435
 
3.9%
3 16974
 
3.1%
6 13385
 
2.4%
4 11027
 
2.0%
Other values (20) 37934
 
6.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1828481
> 99.9%
None 3
 
< 0.1%
Punctuation 2
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
250439
 
13.7%
t 206872
 
11.3%
e 156329
 
8.5%
a 99262
 
5.4%
i 95332
 
5.2%
o 87887
 
4.8%
n 86284
 
4.7%
l 68326
 
3.7%
c 64373
 
3.5%
. 63081
 
3.4%
Other values (69) 650296
35.6%
None
ValueCountFrequency (%)
ñ 2
66.7%
± 1
33.3%
Punctuation
ValueCountFrequency (%)
1
50.0%
1
50.0%

geologicalContextID
Text

Missing 

Distinct: 7
Distinct (%): 100.0%
Missing: 3814092
Missing (%): > 99.9%
Memory size: 29.1 MiB

Length

Max length: 69
Median length: 43
Mean length: 45.28571429
Min length: 28

Characters and Unicode

Total characters: 317
Distinct characters: 30
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 7
Unique (%): 100.0%

Sample

1st row: North America, United States, California
2nd row: North America, United States, Oklahoma, Pontotoc County
3rd row: North America, United States, Alaska
4th row: North America, United States, Massachusetts
5th row: North America, United States, Arizona, Cochise
ValueCountFrequency (%)
north 7
17.5%
united 7
17.5%
states 7
17.5%
america 6
15.0%
county 2
 
5.0%
massachusetts 2
 
5.0%
california 1
 
2.5%
oklahoma 1
 
2.5%
pontotoc 1
 
2.5%
alaska 1
 
2.5%
Other values (5) 5
12.5%

Most occurring characters

ValueCountFrequency (%)
t 39
12.3%
33
 
10.4%
a 28
 
8.8%
e 25
 
7.9%
s 18
 
5.7%
i 18
 
5.7%
o 16
 
5.0%
r 16
 
5.0%
, 16
 
5.0%
n 15
 
4.7%
Other values (20) 93
29.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 228
71.9%
Uppercase Letter 40
 
12.6%
Space Separator 33
 
10.4%
Other Punctuation 16
 
5.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t 39
17.1%
a 28
12.3%
e 25
11.0%
s 18
7.9%
i 18
7.9%
o 16
7.0%
r 16
7.0%
n 15
 
6.6%
c 12
 
5.3%
h 11
 
4.8%
Other values (9) 30
13.2%
Uppercase Letter
ValueCountFrequency (%)
A 9
22.5%
N 7
17.5%
S 7
17.5%
U 7
17.5%
C 4
10.0%
O 2
 
5.0%
M 2
 
5.0%
P 1
 
2.5%
B 1
 
2.5%
Space Separator
ValueCountFrequency (%)
33
100.0%
Other Punctuation
ValueCountFrequency (%)
, 16
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 268
84.5%
Common 49
 
15.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
t 39
14.6%
a 28
 
10.4%
e 25
 
9.3%
s 18
 
6.7%
i 18
 
6.7%
o 16
 
6.0%
r 16
 
6.0%
n 15
 
5.6%
c 12
 
4.5%
h 11
 
4.1%
Other values (18) 70
26.1%
Common
ValueCountFrequency (%)
33
67.3%
, 16
32.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 317
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
t 39
12.3%
33
 
10.4%
a 28
 
8.8%
e 25
 
7.9%
s 18
 
5.7%
i 18
 
5.7%
o 16
 
5.0%
r 16
 
5.0%
, 16
 
5.0%
n 15
 
4.7%
Other values (20) 93
29.3%
Distinct: 8
Distinct (%): 61.5%
Missing: 3814086
Missing (%): > 99.9%
Memory size: 29.1 MiB

Length

Max length: 67
Median length: 55
Mean length: 32.61538462
Min length: 13

Characters and Unicode

Total characters: 424
Distinct characters: 29
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 7
Unique (%): 53.8%

Sample

1st row: North America
2nd row: Plantae, Dicotyledonae, Caryophyllales, Droseraceae
3rd row: North America
4th row: North America
5th row: North America
ValueCountFrequency (%)
north 7
16.7%
america 6
14.3%
plantae 5
11.9%
dicotyledonae 5
11.9%
caryophyllales 2
 
4.8%
myrtales 2
 
4.8%
myrtoideae 1
 
2.4%
fagales 1
 
2.4%
buprestidae 1
 
2.4%
coleoptera 1
 
2.4%
Other values (11) 11
26.2%

Most occurring characters

ValueCountFrequency (%)
a 56
13.2%
e 49
11.6%
t 32
 
7.5%
29
 
6.8%
o 28
 
6.6%
r 26
 
6.1%
l 24
 
5.7%
, 21
 
5.0%
c 20
 
4.7%
i 19
 
4.5%
Other values (19) 120
28.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 332
78.3%
Uppercase Letter 42
 
9.9%
Space Separator 29
 
6.8%
Other Punctuation 21
 
5.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 56
16.9%
e 49
14.8%
t 32
9.6%
o 28
8.4%
r 26
7.8%
l 24
7.2%
c 20
 
6.0%
i 19
 
5.7%
n 16
 
4.8%
y 14
 
4.2%
Other values (7) 48
14.5%
Uppercase Letter
ValueCountFrequency (%)
A 9
21.4%
N 8
19.0%
D 6
14.3%
M 6
14.3%
P 5
11.9%
C 4
9.5%
O 1
 
2.4%
I 1
 
2.4%
B 1
 
2.4%
F 1
 
2.4%
Space Separator
ValueCountFrequency (%)
29
100.0%
Other Punctuation
ValueCountFrequency (%)
, 21
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 374
88.2%
Common 50
 
11.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 56
15.0%
e 49
13.1%
t 32
 
8.6%
o 28
 
7.5%
r 26
 
7.0%
l 24
 
6.4%
c 20
 
5.3%
i 19
 
5.1%
n 16
 
4.3%
y 14
 
3.7%
Other values (17) 90
24.1%
Common
ValueCountFrequency (%)
29
58.0%
, 21
42.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 424
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 56
13.2%
e 49
11.6%
t 32
 
7.5%
29
 
6.8%
o 28
 
6.6%
r 26
 
6.1%
l 24
 
5.7%
, 21
 
5.0%
c 20
 
4.7%
i 19
 
4.5%
Other values (19) 120
28.3%
Distinct: 4
Distinct (%): 50.0%
Missing: 3814091
Missing (%): > 99.9%
Memory size: 29.1 MiB

Length

Max length: 20
Median length: 7
Mean length: 9.375
Min length: 7

Characters and Unicode

Total characters: 75
Distinct characters: 20
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3
Unique (%): 37.5%

Sample

1st row: Plantae
2nd row: Earle, S. A.
3rd row: Plantae
4th row: North Atlantic Ocean
5th row: Plantae
ValueCountFrequency (%)
plantae 5
41.7%
earle 1
 
8.3%
s 1
 
8.3%
a 1
 
8.3%
north 1
 
8.3%
atlantic 1
 
8.3%
ocean 1
 
8.3%
animalia 1
 
8.3%

Most occurring characters

ValueCountFrequency (%)
a 15
20.0%
n 8
10.7%
t 8
10.7%
l 8
10.7%
e 7
9.3%
P 5
 
6.7%
4
 
5.3%
A 3
 
4.0%
i 3
 
4.0%
r 2
 
2.7%
Other values (10) 12
16.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 56
74.7%
Uppercase Letter 12
 
16.0%
Space Separator 4
 
5.3%
Other Punctuation 3
 
4.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 15
26.8%
n 8
14.3%
t 8
14.3%
l 8
14.3%
e 7
12.5%
i 3
 
5.4%
r 2
 
3.6%
c 2
 
3.6%
h 1
 
1.8%
o 1
 
1.8%
Uppercase Letter
ValueCountFrequency (%)
P 5
41.7%
A 3
25.0%
O 1
 
8.3%
S 1
 
8.3%
N 1
 
8.3%
E 1
 
8.3%
Other Punctuation
ValueCountFrequency (%)
. 2
66.7%
, 1
33.3%
Space Separator
ValueCountFrequency (%)
4
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 68
90.7%
Common 7
 
9.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 15
22.1%
n 8
11.8%
t 8
11.8%
l 8
11.8%
e 7
10.3%
P 5
 
7.4%
A 3
 
4.4%
i 3
 
4.4%
r 2
 
2.9%
c 2
 
2.9%
Other values (7) 7
10.3%
Common
ValueCountFrequency (%)
4
57.1%
. 2
28.6%
, 1
 
14.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 75
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 15
20.0%
n 8
10.7%
t 8
10.7%
l 8
10.7%
e 7
9.3%
P 5
 
6.7%
4
 
5.3%
A 3
 
4.0%
i 3
 
4.0%
r 2
 
2.7%
Other values (10) 12
16.0%
Distinct: 3
Distinct (%): 100.0%
Missing: 3814096
Missing (%): > 99.9%
Memory size: 29.1 MiB

Length

Max length: 11
Median length: 10
Mean length: 10.33333333
Min length: 10

Characters and Unicode

Total characters: 31
Distinct characters: 23
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3
Unique (%): 100.0%

Sample

1st row: 1935-06-26
2nd row: Fluminicola
3rd row: Arthropoda
ValueCountFrequency (%)
1935-06-26 1
33.3%
fluminicola 1
33.3%
arthropoda 1
33.3%

Most occurring characters

ValueCountFrequency (%)
o 3
 
9.7%
- 2
 
6.5%
6 2
 
6.5%
r 2
 
6.5%
l 2
 
6.5%
a 2
 
6.5%
i 2
 
6.5%
1 1
 
3.2%
c 1
 
3.2%
p 1
 
3.2%
Other values (13) 13
41.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 19
61.3%
Decimal Number 8
25.8%
Dash Punctuation 2
 
6.5%
Uppercase Letter 2
 
6.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o 3
15.8%
r 2
10.5%
l 2
10.5%
a 2
10.5%
i 2
10.5%
c 1
 
5.3%
p 1
 
5.3%
h 1
 
5.3%
t 1
 
5.3%
m 1
 
5.3%
Other values (3) 3
15.8%
Decimal Number
ValueCountFrequency (%)
6 2
25.0%
1 1
12.5%
9 1
12.5%
2 1
12.5%
0 1
12.5%
5 1
12.5%
3 1
12.5%
Uppercase Letter
ValueCountFrequency (%)
A 1
50.0%
F 1
50.0%
Dash Punctuation
ValueCountFrequency (%)
- 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 21
67.7%
Common 10
32.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
o 3
14.3%
r 2
 
9.5%
l 2
 
9.5%
a 2
 
9.5%
i 2
 
9.5%
c 1
 
4.8%
p 1
 
4.8%
h 1
 
4.8%
t 1
 
4.8%
A 1
 
4.8%
Other values (5) 5
23.8%
Common
ValueCountFrequency (%)
- 2
20.0%
6 2
20.0%
1 1
10.0%
9 1
10.0%
2 1
10.0%
0 1
10.0%
5 1
10.0%
3 1
10.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 31
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
o 3
 
9.7%
- 2
 
6.5%
6 2
 
6.5%
r 2
 
6.5%
l 2
 
6.5%
a 2
 
6.5%
i 2
 
6.5%
1 1
 
3.2%
c 1
 
3.2%
p 1
 
3.2%
Other values (13) 13
41.9%
Distinct: 2
Distinct (%): 33.3%
Missing: 3814093
Missing (%): > 99.9%
Memory size: 29.1 MiB

Length

Max length: 13
Median length: 13
Mean length: 12
Min length: 7

Characters and Unicode

Total characters: 72
Distinct characters: 13
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 16.7%

Sample

1st row: Dicotyledonae
2nd row: Dicotyledonae
3rd row: Dicotyledonae
4th row: Dicotyledonae
5th row: Insecta
ValueCountFrequency (%)
dicotyledonae 5
83.3%
insecta 1
 
16.7%

Most occurring characters

ValueCountFrequency (%)
e 11
15.3%
o 10
13.9%
c 6
8.3%
t 6
8.3%
n 6
8.3%
a 6
8.3%
D 5
6.9%
i 5
6.9%
y 5
6.9%
l 5
6.9%
Other values (3) 7
9.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 66
91.7%
Uppercase Letter 6
 
8.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 11
16.7%
o 10
15.2%
c 6
9.1%
t 6
9.1%
n 6
9.1%
a 6
9.1%
i 5
7.6%
y 5
7.6%
l 5
7.6%
d 5
7.6%
Uppercase Letter
ValueCountFrequency (%)
D 5
83.3%
I 1
 
16.7%

Most occurring scripts

ValueCountFrequency (%)
Latin 72
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 11
15.3%
o 10
13.9%
c 6
8.3%
t 6
8.3%
n 6
8.3%
a 6
8.3%
D 5
6.9%
i 5
6.9%
y 5
6.9%
l 5
6.9%
Other values (3) 7
9.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 72
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 11
15.3%
o 10
13.9%
c 6
8.3%
t 6
8.3%
n 6
8.3%
a 6
8.3%
D 5
6.9%
i 5
6.9%
y 5
6.9%
l 5
6.9%
Other values (3) 7
9.7%
Distinct: 6
Distinct (%): 42.9%
Missing: 3814085
Missing (%): > 99.9%
Memory size: 29.1 MiB

Length

Max length: 14
Median length: 13.5
Mean length: 11.07142857
Min length: 3

Characters and Unicode

Total characters: 155
Distinct characters: 22
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3
Unique (%): 21.4%

Sample

1st row: United States
2nd row: Caryophyllales
3rd row: United States
4th row: United States
5th row: United States
ValueCountFrequency (%)
united 7
33.3%
states 7
33.3%
caryophyllales 2
 
9.5%
myrtales 2
 
9.5%
177 1
 
4.8%
coleoptera 1
 
4.8%
fagales 1
 
4.8%

Most occurring characters

ValueCountFrequency (%)
t 24
15.5%
e 21
13.5%
a 16
10.3%
s 12
 
7.7%
l 10
 
6.5%
U 7
 
4.5%
i 7
 
4.5%
d 7
 
4.5%
7
 
4.5%
S 7
 
4.5%
Other values (12) 37
23.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 125
80.6%
Uppercase Letter 20
 
12.9%
Space Separator 7
 
4.5%
Decimal Number 3
 
1.9%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t 24
19.2%
e 21
16.8%
a 16
12.8%
s 12
9.6%
l 10
8.0%
i 7
 
5.6%
d 7
 
5.6%
n 7
 
5.6%
y 6
 
4.8%
r 5
 
4.0%
Other values (4) 10
8.0%
Uppercase Letter
ValueCountFrequency (%)
U 7
35.0%
S 7
35.0%
C 3
15.0%
M 2
 
10.0%
F 1
 
5.0%
Decimal Number
ValueCountFrequency (%)
7 2
66.7%
1 1
33.3%
Space Separator
ValueCountFrequency (%)
7
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 145
93.5%
Common 10
 
6.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
t 24
16.6%
e 21
14.5%
a 16
11.0%
s 12
8.3%
l 10
 
6.9%
U 7
 
4.8%
i 7
 
4.8%
d 7
 
4.8%
S 7
 
4.8%
n 7
 
4.8%
Other values (9) 27
18.6%
Common
ValueCountFrequency (%)
7
70.0%
7 2
 
20.0%
1 1
 
10.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 155
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
t 24
15.5%
e 21
13.5%
a 16
10.3%
s 12
 
7.7%
l 10
 
6.5%
U 7
 
4.5%
i 7
 
4.5%
d 7
 
4.5%
7
 
4.5%
S 7
 
4.5%
Other values (12) 37
23.9%

latestPeriodOrHighestSystem
Text

Constant  Missing 

Distinct: 1
Distinct (%): 100.0%
Missing: 3814098
Missing (%): > 99.9%
Memory size: 29.1 MiB

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 3
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st row: 177
ValueCountFrequency (%)
177 1
100.0%

Most occurring characters

ValueCountFrequency (%)
7 2
66.7%
1 1
33.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 3
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
7 2
66.7%
1 1
33.3%

Most occurring scripts

ValueCountFrequency (%)
Common 3
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
7 2
66.7%
1 1
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
7 2
66.7%
1 1
33.3%
Distinct: 13
Distinct (%): 92.9%
Missing: 3814085
Missing (%): > 99.9%
Memory size: 29.1 MiB

Length

Max length: 15
Median length: 10.5
Mean length: 9.714285714
Min length: 3

Characters and Unicode

Total characters: 136
Distinct characters: 32
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 12
Unique (%): 85.7%

Sample

1st row: California
2nd row: Droseraceae
3rd row: Oklahoma
4th row: Alaska
5th row: Massachusetts
ValueCountFrequency (%)
massachusetts 2
14.3%
california 1
 
7.1%
droseraceae 1
 
7.1%
oklahoma 1
 
7.1%
alaska 1
 
7.1%
arizona 1
 
7.1%
melastomataceae 1
 
7.1%
1935 1
 
7.1%
nyctaginaceae 1
 
7.1%
sp 1
 
7.1%
Other values (3) 3
21.4%

Most occurring characters

ValueCountFrequency (%)
a 27
19.9%
e 16
11.8%
s 14
 
10.3%
t 9
 
6.6%
c 8
 
5.9%
r 7
 
5.1%
i 6
 
4.4%
o 5
 
3.7%
n 4
 
2.9%
M 4
 
2.9%
Other values (22) 36
26.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 119
87.5%
Uppercase Letter 12
 
8.8%
Decimal Number 4
 
2.9%
Other Punctuation 1
 
0.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 27
22.7%
e 16
13.4%
s 14
11.8%
t 9
 
7.6%
c 8
 
6.7%
r 7
 
5.9%
i 6
 
5.0%
o 5
 
4.2%
n 4
 
3.4%
l 4
 
3.4%
Other values (10) 19
16.0%
Uppercase Letter
ValueCountFrequency (%)
M 4
33.3%
C 2
16.7%
A 2
16.7%
B 1
 
8.3%
N 1
 
8.3%
O 1
 
8.3%
D 1
 
8.3%
Decimal Number
ValueCountFrequency (%)
5 1
25.0%
3 1
25.0%
9 1
25.0%
1 1
25.0%
Other Punctuation
ValueCountFrequency (%)
. 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 131
96.3%
Common 5
 
3.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 27
20.6%
e 16
12.2%
s 14
10.7%
t 9
 
6.9%
c 8
 
6.1%
r 7
 
5.3%
i 6
 
4.6%
o 5
 
3.8%
n 4
 
3.1%
M 4
 
3.1%
Other values (17) 31
23.7%
Common
ValueCountFrequency (%)
5 1
20.0%
. 1
20.0%
3 1
20.0%
9 1
20.0%
1 1
20.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 136
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 27
19.9%
e 16
11.8%
s 14
 
10.3%
t 9
 
6.6%
c 8
 
5.9%
r 7
 
5.1%
i 6
 
4.4%
o 5
 
3.7%
n 4
 
2.9%
M 4
 
2.9%
Other values (22) 36
26.5%
Distinct: 5
Distinct (%): 100.0%
Missing: 3814094
Missing (%): > 99.9%
Memory size: 29.1 MiB

Length

Max length: 58
Median length: 15
Mean length: 19.6
Min length: 1

Characters and Unicode

Total characters: 98
Distinct characters: 32
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 5
Unique (%): 100.0%

Sample

1st row: Pontotoc County
2nd row: North America, Mexico, Baja California Norte, Guadalupe I.
3rd row: Cochise
4th row: Barnstable County
5th row: 6
ValueCountFrequency (%)
county 2
14.3%
pontotoc 1
 
7.1%
north 1
 
7.1%
america 1
 
7.1%
mexico 1
 
7.1%
baja 1
 
7.1%
california 1
 
7.1%
norte 1
 
7.1%
guadalupe 1
 
7.1%
i 1
 
7.1%
Other values (3) 3
21.4%

Most occurring characters

ValueCountFrequency (%)
o 10
 
10.2%
a 9
 
9.2%
9
 
9.2%
t 7
 
7.1%
e 6
 
6.1%
n 5
 
5.1%
r 5
 
5.1%
i 5
 
5.1%
c 4
 
4.1%
C 4
 
4.1%
Other values (22) 34
34.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 71
72.4%
Uppercase Letter 13
 
13.3%
Space Separator 9
 
9.2%
Other Punctuation 4
 
4.1%
Decimal Number 1
 
1.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o 10
14.1%
a 9
12.7%
t 7
9.9%
e 6
8.5%
n 5
 
7.0%
r 5
 
7.0%
i 5
 
7.0%
c 4
 
5.6%
u 4
 
5.6%
l 3
 
4.2%
Other values (10) 13
18.3%
Uppercase Letter
ValueCountFrequency (%)
C 4
30.8%
N 2
15.4%
B 2
15.4%
I 1
 
7.7%
G 1
 
7.7%
P 1
 
7.7%
M 1
 
7.7%
A 1
 
7.7%
Other Punctuation
ValueCountFrequency (%)
, 3
75.0%
. 1
 
25.0%
Space Separator
ValueCountFrequency (%)
9
100.0%
Decimal Number
ValueCountFrequency (%)
6 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 84
85.7%
Common 14
 
14.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
o 10
 
11.9%
a 9
 
10.7%
t 7
 
8.3%
e 6
 
7.1%
n 5
 
6.0%
r 5
 
6.0%
i 5
 
6.0%
c 4
 
4.8%
C 4
 
4.8%
u 4
 
4.8%
Other values (18) 25
29.8%
Common
ValueCountFrequency (%)
9
64.3%
, 3
 
21.4%
. 1
 
7.1%
6 1
 
7.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 98
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
o 10
 
10.2%
a 9
 
9.2%
9
 
9.2%
t 7
 
7.1%
e 6
 
6.1%
n 5
 
5.1%
r 5
 
5.1%
i 5
 
5.1%
c 4
 
4.1%
C 4
 
4.1%
Other values (22) 34
34.7%
Distinct: 2
Distinct (%): 100.0%
Missing: 3814097
Missing (%): > 99.9%
Memory size: 29.1 MiB

Length

Max length: 13
Median length: 7.5
Mean length: 7.5
Min length: 2

Characters and Unicode

Total characters: 15
Distinct characters: 14
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2
Unique (%): 100.0%

Sample

1st row: North America
2nd row: 26
ValueCountFrequency (%)
north 1
33.3%
america 1
33.3%
26 1
33.3%

Most occurring characters

ValueCountFrequency (%)
r 2
13.3%
N 1
 
6.7%
o 1
 
6.7%
t 1
 
6.7%
h 1
 
6.7%
1
 
6.7%
A 1
 
6.7%
m 1
 
6.7%
e 1
 
6.7%
i 1
 
6.7%
Other values (4) 4
26.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 10
66.7%
Uppercase Letter 2
 
13.3%
Decimal Number 2
 
13.3%
Space Separator 1
 
6.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
r 2
20.0%
o 1
10.0%
t 1
10.0%
h 1
10.0%
m 1
10.0%
e 1
10.0%
i 1
10.0%
c 1
10.0%
a 1
10.0%
Uppercase Letter
ValueCountFrequency (%)
N 1
50.0%
A 1
50.0%
Decimal Number
ValueCountFrequency (%)
2 1
50.0%
6 1
50.0%
Space Separator
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 12
80.0%
Common 3
 
20.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
r 2
16.7%
N 1
8.3%
o 1
8.3%
t 1
8.3%
h 1
8.3%
A 1
8.3%
m 1
8.3%
e 1
8.3%
i 1
8.3%
c 1
8.3%
Common
ValueCountFrequency (%)
1
33.3%
2 1
33.3%
6 1
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 15
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
r 2
13.3%
N 1
 
6.7%
o 1
 
6.7%
t 1
 
6.7%
h 1
 
6.7%
1
 
6.7%
A 1
 
6.7%
m 1
 
6.7%
e 1
 
6.7%
i 1
 
6.7%
Other values (4) 4
26.7%
Distinct: 6
Distinct (%): 85.7%
Missing: 3814092
Missing (%): > 99.9%
Memory size: 29.1 MiB

Length

Max length: 49
Median length: 13
Mean length: 14.71428571
Min length: 3

Characters and Unicode

Total characters: 103
Distinct characters: 30
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 5
Unique (%): 71.4%

Sample

1st row: San Francisco
2nd row: Ada
3rd row: Seldovia
4th row: Scharf, U.
5th row: Woods Hole
ValueCountFrequency (%)
woods 2
12.5%
hole 2
12.5%
san 1
 
6.2%
francisco 1
 
6.2%
ada 1
 
6.2%
seldovia 1
 
6.2%
scharf 1
 
6.2%
u 1
 
6.2%
chiricahua 1
 
6.2%
mountains 1
 
6.2%
Other values (4) 4
25.0%

Most occurring characters

ValueCountFrequency (%)
o 13
 
12.6%
a 10
 
9.7%
9
 
8.7%
s 6
 
5.8%
n 6
 
5.8%
r 5
 
4.9%
l 5
 
4.9%
i 5
 
4.9%
d 4
 
3.9%
c 4
 
3.9%
Other values (20) 36
35.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 75
72.8%
Uppercase Letter 14
 
13.6%
Space Separator 9
 
8.7%
Other Punctuation 5
 
4.9%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o 13
17.3%
a 10
13.3%
s 6
8.0%
n 6
8.0%
r 5
 
6.7%
l 5
 
6.7%
i 5
 
6.7%
d 4
 
5.3%
c 4
 
5.3%
h 3
 
4.0%
Other values (7) 14
18.7%
Uppercase Letter
ValueCountFrequency (%)
S 3
21.4%
W 2
14.3%
H 2
14.3%
P 1
 
7.1%
M 1
 
7.1%
B 1
 
7.1%
A 1
 
7.1%
C 1
 
7.1%
U 1
 
7.1%
F 1
 
7.1%
Other Punctuation
ValueCountFrequency (%)
, 3
60.0%
. 2
40.0%
Space Separator
ValueCountFrequency (%)
9
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 89
86.4%
Common 14
 
13.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
o 13
14.6%
a 10
 
11.2%
s 6
 
6.7%
n 6
 
6.7%
r 5
 
5.6%
l 5
 
5.6%
i 5
 
5.6%
d 4
 
4.5%
c 4
 
4.5%
S 3
 
3.4%
Other values (17) 28
31.5%
Common
ValueCountFrequency (%)
9
64.3%
, 3
 
21.4%
. 2
 
14.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 103
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
o 13
 
12.6%
a 10
 
9.7%
9
 
8.7%
s 6
 
5.8%
n 6
 
5.8%
r 5
 
4.9%
l 5
 
4.9%
i 5
 
4.9%
d 4
 
3.9%
c 4
 
3.9%
Other values (20) 36
35.0%
Distinct 6
Distinct (%) 100.0%
Missing 3814093
Missing (%) > 99.9%
Memory size 29.1 MiB

Length

Max length 9
Median length 8
Mean length 7.833333333
Min length 6

Characters and Unicode

Total characters 47
Distinct characters 18
Distinct categories 2
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 6
Unique (%) 100.0%

Sample

1st row Drosera
2nd row Miconia
3rd row Boerhavia
4th row Myrcia
5th row Buprestis
ValueCountFrequency (%)
drosera 1
16.7%
miconia 1
16.7%
boerhavia 1
16.7%
myrcia 1
16.7%
buprestis 1
16.7%
casuarina 1
16.7%

Most occurring characters

ValueCountFrequency (%)
a 8
17.0%
i 6
12.8%
r 6
12.8%
s 4
8.5%
o 3
 
6.4%
e 3
 
6.4%
n 2
 
4.3%
u 2
 
4.3%
M 2
 
4.3%
c 2
 
4.3%
Other values (8) 9
19.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 41
87.2%
Uppercase Letter 6
 
12.8%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 8
19.5%
i 6
14.6%
r 6
14.6%
s 4
9.8%
o 3
 
7.3%
e 3
 
7.3%
n 2
 
4.9%
u 2
 
4.9%
c 2
 
4.9%
t 1
 
2.4%
Other values (4) 4
9.8%
Uppercase Letter
ValueCountFrequency (%)
M 2
33.3%
B 2
33.3%
D 1
16.7%
C 1
16.7%

Most occurring scripts

ValueCountFrequency (%)
Latin 47
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 8
17.0%
i 6
12.8%
r 6
12.8%
s 4
8.5%
o 3
 
6.4%
e 3
 
6.4%
n 2
 
4.3%
u 2
 
4.3%
M 2
 
4.3%
c 2
 
4.3%
Other values (8) 9
19.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 47
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 8
17.0%
i 6
12.8%
r 6
12.8%
s 4
8.5%
o 3
 
6.4%
e 3
 
6.4%
n 2
 
4.3%
u 2
 
4.3%
M 2
 
4.3%
c 2
 
4.3%
Other values (8) 9
19.1%
Distinct 2
Distinct (%) 100.0%
Missing 3814097
Missing (%) > 99.9%
Memory size 29.1 MiB

Length

Max length 12
Median length 9
Mean length 9
Min length 6

Characters and Unicode

Total characters 18
Distinct characters 15
Distinct categories 5
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 2
Unique (%) 100.0%

Sample

1st row Guadalupe I.
2nd row 2438.0
ValueCountFrequency (%)
guadalupe 1
33.3%
i 1
33.3%
2438.0 1
33.3%

Most occurring characters

ValueCountFrequency (%)
u 2
 
11.1%
a 2
 
11.1%
. 2
 
11.1%
G 1
 
5.6%
d 1
 
5.6%
l 1
 
5.6%
p 1
 
5.6%
e 1
 
5.6%
1
 
5.6%
I 1
 
5.6%
Other values (5) 5
27.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 8
44.4%
Decimal Number 5
27.8%
Other Punctuation 2
 
11.1%
Uppercase Letter 2
 
11.1%
Space Separator 1
 
5.6%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
u 2
25.0%
a 2
25.0%
d 1
12.5%
l 1
12.5%
p 1
12.5%
e 1
12.5%
Decimal Number
ValueCountFrequency (%)
2 1
20.0%
4 1
20.0%
3 1
20.0%
8 1
20.0%
0 1
20.0%
Uppercase Letter
ValueCountFrequency (%)
G 1
50.0%
I 1
50.0%
Other Punctuation
ValueCountFrequency (%)
. 2
100.0%
Space Separator
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 10
55.6%
Common 8
44.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
u 2
20.0%
a 2
20.0%
G 1
10.0%
d 1
10.0%
l 1
10.0%
p 1
10.0%
e 1
10.0%
I 1
10.0%
Common
ValueCountFrequency (%)
. 2
25.0%
1
12.5%
2 1
12.5%
4 1
12.5%
3 1
12.5%
8 1
12.5%
0 1
12.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 18
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
u 2
 
11.1%
a 2
 
11.1%
. 2
 
11.1%
G 1
 
5.6%
d 1
 
5.6%
l 1
 
5.6%
p 1
 
5.6%
e 1
 
5.6%
1
 
5.6%
I 1
 
5.6%
Other values (5) 5
27.8%
Distinct 2
Distinct (%) 100.0%
Missing 3814097
Missing (%) > 99.9%
Memory size 29.1 MiB

Length

Max length 22
Median length 14
Mean length 14
Min length 6

Characters and Unicode

Total characters 28
Distinct characters 18
Distinct categories 3
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 2
Unique (%) 100.0%

Sample

1st row Campanula rotundifolia
2nd row Mexico
ValueCountFrequency (%)
campanula 1
33.3%
rotundifolia 1
33.3%
mexico 1
33.3%

Most occurring characters

ValueCountFrequency (%)
a 4
14.3%
o 3
 
10.7%
i 3
 
10.7%
n 2
 
7.1%
u 2
 
7.1%
l 2
 
7.1%
d 1
 
3.6%
x 1
 
3.6%
e 1
 
3.6%
M 1
 
3.6%
Other values (8) 8
28.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 25
89.3%
Uppercase Letter 2
 
7.1%
Space Separator 1
 
3.6%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 4
16.0%
o 3
12.0%
i 3
12.0%
n 2
 
8.0%
u 2
 
8.0%
l 2
 
8.0%
d 1
 
4.0%
x 1
 
4.0%
e 1
 
4.0%
f 1
 
4.0%
Other values (5) 5
20.0%
Uppercase Letter
ValueCountFrequency (%)
M 1
50.0%
C 1
50.0%
Space Separator
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 27
96.4%
Common 1
 
3.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 4
14.8%
o 3
11.1%
i 3
11.1%
n 2
 
7.4%
u 2
 
7.4%
l 2
 
7.4%
d 1
 
3.7%
x 1
 
3.7%
e 1
 
3.7%
M 1
 
3.7%
Other values (7) 7
25.9%
Common
ValueCountFrequency (%)
1
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 28
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 4
14.3%
o 3
 
10.7%
i 3
 
10.7%
n 2
 
7.1%
u 2
 
7.1%
l 2
 
7.1%
d 1
 
3.6%
x 1
 
3.6%
e 1
 
3.6%
M 1
 
3.6%
Other values (8) 8
28.6%

formation
Text

Missing 

Distinct 7
Distinct (%) 100.0%
Missing 3814092
Missing (%) > 99.9%
Memory size 29.1 MiB

Length

Max length 21
Median length 9
Mean length 8.857142857
Min length 3

Characters and Unicode

Total characters 62
Distinct characters 21
Distinct categories 4
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 7
Unique (%) 100.0%

Sample

1st row sp.
2nd row Baja California Norte
3rd row coronata
4th row diffusa
5th row splendens
ValueCountFrequency (%)
sp 1
11.1%
baja 1
11.1%
california 1
11.1%
norte 1
11.1%
coronata 1
11.1%
diffusa 1
11.1%
splendens 1
11.1%
fulgens 1
11.1%
stricta 1
11.1%

Most occurring characters

ValueCountFrequency (%)
a 8
12.9%
s 6
 
9.7%
n 5
 
8.1%
i 4
 
6.5%
e 4
 
6.5%
t 4
 
6.5%
r 4
 
6.5%
o 4
 
6.5%
f 4
 
6.5%
l 3
 
4.8%
Other values (11) 16
25.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 56
90.3%
Uppercase Letter 3
 
4.8%
Space Separator 2
 
3.2%
Other Punctuation 1
 
1.6%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 8
14.3%
s 6
10.7%
n 5
8.9%
i 4
 
7.1%
e 4
 
7.1%
t 4
 
7.1%
r 4
 
7.1%
o 4
 
7.1%
f 4
 
7.1%
l 3
 
5.4%
Other values (6) 10
17.9%
Uppercase Letter
ValueCountFrequency (%)
C 1
33.3%
N 1
33.3%
B 1
33.3%
Space Separator
ValueCountFrequency (%)
2
100.0%
Other Punctuation
ValueCountFrequency (%)
. 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 59
95.2%
Common 3
 
4.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 8
13.6%
s 6
10.2%
n 5
 
8.5%
i 4
 
6.8%
e 4
 
6.8%
t 4
 
6.8%
r 4
 
6.8%
o 4
 
6.8%
f 4
 
6.8%
l 3
 
5.1%
Other values (9) 13
22.0%
Common
ValueCountFrequency (%)
2
66.7%
. 1
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 62
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 8
12.9%
s 6
 
9.7%
n 5
 
8.1%
i 4
 
6.5%
e 4
 
6.5%
t 4
 
6.5%
r 4
 
6.5%
o 4
 
6.5%
f 4
 
6.5%
l 3
 
4.8%
Other values (11) 16
25.8%

member
Text

Missing 

Distinct 2
Distinct (%) 100.0%
Missing 3814097
Missing (%) > 99.9%
Memory size 29.1 MiB

Length

Max length 18
Median length 15
Mean length 15
Min length 12

Characters and Unicode

Total characters 30
Distinct characters 18
Distinct categories 4
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 2
Unique (%) 100.0%

Sample

1st row Colpomenia sinuosa
2nd row Ochtodes sp.
ValueCountFrequency (%)
colpomenia 1
25.0%
sinuosa 1
25.0%
ochtodes 1
25.0%
sp 1
25.0%

Most occurring characters

ValueCountFrequency (%)
o 4
13.3%
s 4
13.3%
2
 
6.7%
p 2
 
6.7%
e 2
 
6.7%
n 2
 
6.7%
i 2
 
6.7%
a 2
 
6.7%
c 1
 
3.3%
d 1
 
3.3%
Other values (8) 8
26.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 25
83.3%
Space Separator 2
 
6.7%
Uppercase Letter 2
 
6.7%
Other Punctuation 1
 
3.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o 4
16.0%
s 4
16.0%
p 2
8.0%
e 2
8.0%
n 2
8.0%
i 2
8.0%
a 2
8.0%
c 1
 
4.0%
d 1
 
4.0%
t 1
 
4.0%
Other values (4) 4
16.0%
Uppercase Letter
ValueCountFrequency (%)
C 1
50.0%
O 1
50.0%
Space Separator
ValueCountFrequency (%)
2
100.0%
Other Punctuation
ValueCountFrequency (%)
. 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 27
90.0%
Common 3
 
10.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
o 4
14.8%
s 4
14.8%
p 2
 
7.4%
e 2
 
7.4%
n 2
 
7.4%
i 2
 
7.4%
a 2
 
7.4%
c 1
 
3.7%
d 1
 
3.7%
t 1
 
3.7%
Other values (6) 6
22.2%
Common
ValueCountFrequency (%)
2
66.7%
. 1
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 30
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
o 4
13.3%
s 4
13.3%
2
 
6.7%
p 2
 
6.7%
e 2
 
6.7%
n 2
 
6.7%
i 2
 
6.7%
a 2
 
6.7%
c 1
 
3.3%
d 1
 
3.3%
Other values (8) 8
26.7%

bed
Text

Constant  Missing 

Distinct 1
Distinct (%) 100.0%
Missing 3814098
Missing (%) > 99.9%
Memory size 29.1 MiB

Length

Max length 17
Median length 17
Mean length 17
Min length 17

Characters and Unicode

Total characters 17
Distinct characters 12
Distinct categories 3
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 1
Unique (%) 100.0%

Sample

1st row Riccardia pinguis
ValueCountFrequency (%)
riccardia 1
50.0%
pinguis 1
50.0%

Most occurring characters

ValueCountFrequency (%)
i 4
23.5%
c 2
11.8%
a 2
11.8%
R 1
 
5.9%
r 1
 
5.9%
d 1
 
5.9%
1
 
5.9%
p 1
 
5.9%
n 1
 
5.9%
g 1
 
5.9%
Other values (2) 2
11.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 15
88.2%
Uppercase Letter 1
 
5.9%
Space Separator 1
 
5.9%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 4
26.7%
c 2
13.3%
a 2
13.3%
r 1
 
6.7%
d 1
 
6.7%
p 1
 
6.7%
n 1
 
6.7%
g 1
 
6.7%
u 1
 
6.7%
s 1
 
6.7%
Uppercase Letter
ValueCountFrequency (%)
R 1
100.0%
Space Separator
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 16
94.1%
Common 1
 
5.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 4
25.0%
c 2
12.5%
a 2
12.5%
R 1
 
6.2%
r 1
 
6.2%
d 1
 
6.2%
p 1
 
6.2%
n 1
 
6.2%
g 1
 
6.2%
u 1
 
6.2%
Common
ValueCountFrequency (%)
1
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 17
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 4
23.5%
c 2
11.8%
a 2
11.8%
R 1
 
5.9%
r 1
 
5.9%
d 1
 
5.9%
1
 
5.9%
p 1
 
5.9%
n 1
 
5.9%
g 1
 
5.9%
Other values (2) 2
11.8%

identificationID
Text

Constant  Missing 

Distinct 1
Distinct (%) 100.0%
Missing 3814098
Missing (%) > 99.9%
Memory size 29.1 MiB

Length

Max length 34
Median length 34
Mean length 34
Min length 34

Characters and Unicode

Total characters 34
Distinct characters 20
Distinct categories 4
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 1
Unique (%) 100.0%

Sample

1st row Guadalupe Island, Baja California.
ValueCountFrequency (%)
guadalupe 1
25.0%
island 1
25.0%
baja 1
25.0%
california 1
25.0%

Most occurring characters

ValueCountFrequency (%)
a 7
20.6%
l 3
 
8.8%
3
 
8.8%
n 2
 
5.9%
d 2
 
5.9%
i 2
 
5.9%
u 2
 
5.9%
j 1
 
2.9%
r 1
 
2.9%
o 1
 
2.9%
Other values (10) 10
29.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 25
73.5%
Uppercase Letter 4
 
11.8%
Space Separator 3
 
8.8%
Other Punctuation 2
 
5.9%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 7
28.0%
l 3
12.0%
n 2
 
8.0%
d 2
 
8.0%
i 2
 
8.0%
u 2
 
8.0%
j 1
 
4.0%
r 1
 
4.0%
o 1
 
4.0%
f 1
 
4.0%
Other values (3) 3
12.0%
Uppercase Letter
ValueCountFrequency (%)
C 1
25.0%
G 1
25.0%
B 1
25.0%
I 1
25.0%
Other Punctuation
ValueCountFrequency (%)
, 1
50.0%
. 1
50.0%
Space Separator
ValueCountFrequency (%)
3
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 29
85.3%
Common 5
 
14.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 7
24.1%
l 3
10.3%
n 2
 
6.9%
d 2
 
6.9%
i 2
 
6.9%
u 2
 
6.9%
j 1
 
3.4%
r 1
 
3.4%
o 1
 
3.4%
f 1
 
3.4%
Other values (7) 7
24.1%
Common
ValueCountFrequency (%)
3
60.0%
, 1
 
20.0%
. 1
 
20.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 34
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 7
20.6%
l 3
 
8.8%
3
 
8.8%
n 2
 
5.9%
d 2
 
5.9%
i 2
 
5.9%
u 2
 
5.9%
j 1
 
2.9%
r 1
 
2.9%
o 1
 
2.9%
Other values (10) 10
29.4%
Distinct 32
Distinct (%) 0.2%
Missing 3799723
Missing (%) 99.6%
Memory size 29.1 MiB

Length

Max length 64
Median length 3
Mean length 4.326377295
Min length 2

Characters and Unicode

Total characters 62196
Distinct characters 36
Distinct categories 6
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 12
Unique (%) 0.1%

Sample

1st row near
2nd row cf.
3rd row cf.
4th row vel aff.
5th row vel aff.
ValueCountFrequency (%)
cf 9529
65.3%
uncertain 2623
 
18.0%
aff 1483
 
10.2%
near 410
 
2.8%
s.l 211
 
1.4%
vel 146
 
1.0%
group 45
 
0.3%
sp 38
 
0.3%
subgroup 35
 
0.2%
nov 23
 
0.2%
Other values (23) 53
 
0.4%

Most occurring characters

ValueCountFrequency (%)
f 12495
20.1%
c 12171
19.6%
. 11441
18.4%
n 5696
9.2%
a 4533
 
7.3%
e 3208
 
5.2%
r 3120
 
5.0%
u 2664
 
4.3%
t 2636
 
4.2%
i 2632
 
4.2%
Other values (26) 1600
 
2.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 50426
81.1%
Other Punctuation 11447
 
18.4%
Space Separator 220
 
0.4%
Uppercase Letter 99
 
0.2%
Open Punctuation 2
 
< 0.1%
Close Punctuation 2
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
f 12495
24.8%
c 12171
24.1%
n 5696
11.3%
a 4533
 
9.0%
e 3208
 
6.4%
r 3120
 
6.2%
u 2664
 
5.3%
t 2636
 
5.2%
i 2632
 
5.2%
l 376
 
0.7%
Other values (12) 895
 
1.8%
Uppercase Letter
ValueCountFrequency (%)
U 81
81.8%
C 5
 
5.1%
D 3
 
3.0%
A 2
 
2.0%
B 2
 
2.0%
S 2
 
2.0%
L 2
 
2.0%
P 1
 
1.0%
N 1
 
1.0%
Other Punctuation
ValueCountFrequency (%)
. 11441
99.9%
, 6
 
0.1%
Space Separator
ValueCountFrequency (%)
220
100.0%
Open Punctuation
ValueCountFrequency (%)
( 2
100.0%
Close Punctuation
ValueCountFrequency (%)
) 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 50525
81.2%
Common 11671
 
18.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
f 12495
24.7%
c 12171
24.1%
n 5696
11.3%
a 4533
 
9.0%
e 3208
 
6.3%
r 3120
 
6.2%
u 2664
 
5.3%
t 2636
 
5.2%
i 2632
 
5.2%
l 376
 
0.7%
Other values (21) 994
 
2.0%
Common
ValueCountFrequency (%)
. 11441
98.0%
220
 
1.9%
, 6
 
0.1%
( 2
 
< 0.1%
) 2
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 62196
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
f 12495
20.1%
c 12171
19.6%
. 11441
18.4%
n 5696
9.2%
a 4533
 
7.3%
e 3208
 
5.2%
r 3120
 
5.0%
u 2664
 
4.3%
t 2636
 
4.2%
i 2632
 
4.2%
Other values (26) 1600
 
2.6%

typeStatus
Text

Missing 

Distinct 254
Distinct (%) 0.2%
Missing 3664511
Missing (%) 96.1%
Memory size 29.1 MiB

Length

Max length 60
Median length 8
Mean length 7.885899938
Min length 1

Characters and Unicode

Total characters 1179636
Distinct characters 45
Distinct categories 7
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 108
Unique (%) 0.1%

Sample

1st row Type
2nd row Holotype
3rd row Type
4th row Holotype
5th row Holotype
ValueCountFrequency (%)
holotype 43277
26.9%
paratype 31271
19.4%
type 25920
16.1%
isotype 25475
15.8%
syntype 13189
 
8.2%
collection 3918
 
2.4%
lectotype 3353
 
2.1%
isosyntype 2798
 
1.7%
fragment 2239
 
1.4%
allotype 1694
 
1.1%
Other values (55) 7712
 
4.8%

Most occurring characters

ValueCountFrequency (%)
y 168855
14.3%
e 165218
14.0%
p 151601
12.9%
t 138010
11.7%
o 134667
11.4%
a 69582
 
5.9%
l 58287
 
4.9%
H 43384
 
3.7%
r 37775
 
3.2%
s 34976
 
3.0%
Other values (35) 177281
15.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1006069
85.3%
Uppercase Letter 160404
 
13.6%
Space Separator 11258
 
1.0%
Other Punctuation 1464
 
0.1%
Math Symbol 437
 
< 0.1%
Open Punctuation 2
 
< 0.1%
Close Punctuation 2
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
y 168855
16.8%
e 165218
16.4%
p 151601
15.1%
t 138010
13.7%
o 134667
13.4%
a 69582
6.9%
l 58287
 
5.8%
r 37775
 
3.8%
s 34976
 
3.5%
n 23037
 
2.3%
Other values (11) 24061
 
2.4%
Uppercase Letter
ValueCountFrequency (%)
H 43384
27.0%
P 34860
21.7%
I 29652
18.5%
T 25922
16.2%
S 13362
 
8.3%
C 4936
 
3.1%
L 3359
 
2.1%
F 2239
 
1.4%
A 1697
 
1.1%
N 452
 
0.3%
Other values (7) 541
 
0.3%
Other Punctuation
ValueCountFrequency (%)
; 1428
97.5%
? 34
 
2.3%
. 2
 
0.1%
Space Separator
ValueCountFrequency (%)
11258
100.0%
Math Symbol
ValueCountFrequency (%)
+ 437
100.0%
Open Punctuation
ValueCountFrequency (%)
( 2
100.0%
Close Punctuation
ValueCountFrequency (%)
) 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1166473
98.9%
Common 13163
 
1.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
y 168855
14.5%
e 165218
14.2%
p 151601
13.0%
t 138010
11.8%
o 134667
11.5%
a 69582
6.0%
l 58287
 
5.0%
H 43384
 
3.7%
r 37775
 
3.2%
s 34976
 
3.0%
Other values (28) 164118
14.1%
Common
ValueCountFrequency (%)
11258
85.5%
; 1428
 
10.8%
+ 437
 
3.3%
? 34
 
0.3%
( 2
 
< 0.1%
) 2
 
< 0.1%
. 2
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1179636
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
y 168855
14.3%
e 165218
14.0%
p 151601
12.9%
t 138010
11.7%
o 134667
11.4%
a 69582
 
5.9%
l 58287
 
4.9%
H 43384
 
3.7%
r 37775
 
3.2%
s 34976
 
3.0%
Other values (35) 177281
15.0%

identifiedBy
Text

Missing 

Distinct 18525
Distinct (%) 2.8%
Missing 3157857
Missing (%) 82.8%
Memory size 29.1 MiB

Length

Max length 226
Median length 141
Mean length 36.93504987
Min length 2

Characters and Unicode

Total characters 24238331
Distinct characters 109
Distinct categories 12
Distinct scripts 2
Distinct blocks 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 6662
Unique (%) 1.0%

Sample

1st row Badley, J. E.
2nd row Strong, M. T., (US), Smithsonian Institution - National Museum of Natural History (UNITED STATES)
3rd row Johnson, M. W.
4th row Zibrowius, Helmut, (CNRS-UA 41), Centre d'Oceanologie de Marseille (CNRS-UA 41) (FRANCE)
5th row Foster, W. D.
ValueCountFrequency (%)
of 165315
 
4.6%
museum 142804
 
3.9%
national 141794
 
3.9%
institution 137456
 
3.8%
smithsonian 136494
 
3.8%
natural 136113
 
3.8%
history 135920
 
3.8%
united 123630
 
3.4%
states 123300
 
3.4%
98263
 
2.7%
Other values (13036) 2275712
62.9%

Most occurring characters

ValueCountFrequency (%)
2960559
 
12.2%
a 1451424
 
6.0%
t 1441525
 
5.9%
i 1422745
 
5.9%
n 1327574
 
5.5%
o 1302984
 
5.4%
e 1067690
 
4.4%
, 1034400
 
4.3%
r 1025293
 
4.2%
s 943871
 
3.9%
Other values (99) 10260266
42.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 13847813
57.1%
Uppercase Letter 4898288
 
20.2%
Space Separator 2960559
 
12.2%
Other Punctuation 1906878
 
7.9%
Open Punctuation 252924
 
1.0%
Close Punctuation 252924
 
1.0%
Dash Punctuation 116493
 
0.5%
Decimal Number 2356
 
< 0.1%
Math Symbol 39
 
< 0.1%
Initial Punctuation 28
 
< 0.1%
Other values (2) 29
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 1451424
10.5%
t 1441525
10.4%
i 1422745
10.3%
n 1327574
9.6%
o 1302984
9.4%
e 1067690
7.7%
r 1025293
7.4%
s 943871
 
6.8%
u 787715
 
5.7%
l 695948
 
5.0%
Other values (42) 2381044
17.2%
Uppercase Letter
ValueCountFrequency (%)
S 576195
11.8%
T 484786
 
9.9%
N 467905
 
9.6%
E 358991
 
7.3%
M 338951
 
6.9%
I 330825
 
6.8%
A 282381
 
5.8%
H 279825
 
5.7%
D 247568
 
5.1%
U 197622
 
4.0%
Other values (21) 1333239
27.2%
Other Punctuation
ValueCountFrequency (%)
, 1034400
54.2%
. 822681
43.1%
; 37495
 
2.0%
/ 6890
 
0.4%
& 2690
 
0.1%
' 2180
 
0.1%
" 522
 
< 0.1%
¡ 14
 
< 0.1%
? 6
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
1 1102
46.8%
4 1101
46.7%
2 59
 
2.5%
0 34
 
1.4%
9 31
 
1.3%
6 29
 
1.2%
Open Punctuation
ValueCountFrequency (%)
( 251996
99.6%
[ 928
 
0.4%
Close Punctuation
ValueCountFrequency (%)
) 251996
99.6%
] 928
 
0.4%
Dash Punctuation
ValueCountFrequency (%)
- 116489
> 99.9%
4
 
< 0.1%
Space Separator
ValueCountFrequency (%)
2960559
100.0%
Math Symbol
ValueCountFrequency (%)
+ 39
100.0%
Initial Punctuation
ValueCountFrequency (%)
28
100.0%
Final Punctuation
ValueCountFrequency (%)
28
100.0%
Currency Symbol
ValueCountFrequency (%)
¢ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 18746101
77.3%
Common 5492230
 
22.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 1451424
 
7.7%
t 1441525
 
7.7%
i 1422745
 
7.6%
n 1327574
 
7.1%
o 1302984
 
7.0%
e 1067690
 
5.7%
r 1025293
 
5.5%
s 943871
 
5.0%
u 787715
 
4.2%
l 695948
 
3.7%
Other values (73) 7279332
38.8%
Common
ValueCountFrequency (%)
2960559
53.9%
, 1034400
 
18.8%
. 822681
 
15.0%
( 251996
 
4.6%
) 251996
 
4.6%
- 116489
 
2.1%
; 37495
 
0.7%
/ 6890
 
0.1%
& 2690
 
< 0.1%
' 2180
 
< 0.1%
Other values (16) 4854
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 24222691
99.9%
None 15580
 
0.1%
Punctuation 60
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2960559
 
12.2%
a 1451424
 
6.0%
t 1441525
 
6.0%
i 1422745
 
5.9%
n 1327574
 
5.5%
o 1302984
 
5.4%
e 1067690
 
4.4%
, 1034400
 
4.3%
r 1025293
 
4.2%
s 943871
 
3.9%
Other values (63) 10244626
42.3%
None
ValueCountFrequency (%)
í 8096
52.0%
é 1950
 
12.5%
á 1861
 
11.9%
ñ 771
 
4.9%
ö 715
 
4.6%
ü 490
 
3.1%
ó 443
 
2.8%
ä 332
 
2.1%
ã 286
 
1.8%
ú 135
 
0.9%
Other values (23) 501
 
3.2%
Punctuation
ValueCountFrequency (%)
28
46.7%
28
46.7%
4
 
6.7%

identifiedByID
Text

Missing 

Distinct 7
Distinct (%) 100.0%
Missing 3814092
Missing (%) > 99.9%
Memory size 29.1 MiB

Length

Max length 69
Median length 7
Mean length 25.28571429
Min length 7

Characters and Unicode

Total characters 177
Distinct characters 38
Distinct categories 5
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 7
Unique (%) 100.0%

Sample

1st row 37.7749
2nd row Chromista, Ochrophyta, Phaeophyceae, Ectocarpales, Scytosiphonaceae
3rd row 34.7745
4th row Dicotyledonae
5th row 59.4381
ValueCountFrequency (%)
37.7749 1
 
6.7%
chromista 1
 
6.7%
ochrophyta 1
 
6.7%
phaeophyceae 1
 
6.7%
ectocarpales 1
 
6.7%
scytosiphonaceae 1
 
6.7%
34.7745 1
 
6.7%
dicotyledonae 1
 
6.7%
59.4381 1
 
6.7%
41.5265 1
 
6.7%
Other values (5) 5
33.3%

Most occurring characters

ValueCountFrequency (%)
a 17
 
9.6%
e 15
 
8.5%
o 13
 
7.3%
h 11
 
6.2%
c 9
 
5.1%
i 8
 
4.5%
t 8
 
4.5%
8
 
4.5%
, 8
 
4.5%
p 7
 
4.0%
Other values (28) 73
41.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 122
68.9%
Decimal Number 24
 
13.6%
Other Punctuation 12
 
6.8%
Uppercase Letter 11
 
6.2%
Space Separator 8
 
4.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 17
13.9%
e 15
12.3%
o 13
10.7%
h 11
9.0%
c 9
7.4%
i 8
 
6.6%
t 8
 
6.6%
p 7
 
5.7%
l 7
 
5.7%
y 7
 
5.7%
Other values (7) 20
16.4%
Decimal Number
ValueCountFrequency (%)
4 5
20.8%
7 5
20.8%
5 4
16.7%
3 3
12.5%
1 2
 
8.3%
9 2
 
8.3%
6 1
 
4.2%
2 1
 
4.2%
8 1
 
4.2%
Uppercase Letter
ValueCountFrequency (%)
P 2
18.2%
R 2
18.2%
G 1
9.1%
F 1
9.1%
E 1
9.1%
D 1
9.1%
C 1
9.1%
S 1
9.1%
O 1
9.1%
Other Punctuation
ValueCountFrequency (%)
, 8
66.7%
. 4
33.3%
Space Separator
ValueCountFrequency (%)
8
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 133
75.1%
Common 44
 
24.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 17
12.8%
e 15
11.3%
o 13
9.8%
h 11
 
8.3%
c 9
 
6.8%
i 8
 
6.0%
t 8
 
6.0%
p 7
 
5.3%
l 7
 
5.3%
y 7
 
5.3%
Other values (16) 31
23.3%
Common
ValueCountFrequency (%)
8
18.2%
, 8
18.2%
4 5
11.4%
7 5
11.4%
5 4
9.1%
. 4
9.1%
3 3
 
6.8%
1 2
 
4.5%
9 2
 
4.5%
6 1
 
2.3%
Other values (2) 2
 
4.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 177
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 17
 
9.6%
e 15
 
8.5%
o 13
 
7.3%
h 11
 
6.2%
c 9
 
5.1%
i 8
 
4.5%
t 8
 
4.5%
8
 
4.5%
, 8
 
4.5%
p 7
 
4.0%
Other values (28) 73
41.2%

dateIdentified
Text

Missing 

Distinct 9
Distinct (%) 100.0%
Missing 3814090
Missing (%) > 99.9%
Memory size 29.1 MiB

Length

Max length 69
Median length 18
Mean length 16
Min length 7

Characters and Unicode

Total characters 144
Distinct characters 38
Distinct categories 6
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 9
Unique (%) 100.0%

Sample

1st row -122.419
2nd row Chromista
3rd row -96.6783
4th row Asterales
5th row -151.711
ValueCountFrequency (%)
plantae 2
14.3%
122.419 1
 
7.1%
chromista 1
 
7.1%
96.6783 1
 
7.1%
asterales 1
 
7.1%
151.711 1
 
7.1%
guatteria 1
 
7.1%
punctata 1
 
7.1%
70.6731 1
 
7.1%
marchantiophyta 1
 
7.1%
Other values (3) 3
21.4%

Most occurring characters

ValueCountFrequency (%)
a 18
 
12.5%
e 12
 
8.3%
t 11
 
7.6%
n 8
 
5.6%
r 7
 
4.9%
1 7
 
4.9%
i 6
 
4.2%
s 5
 
3.5%
5
 
3.5%
u 4
 
2.8%
Other values (28) 61
42.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 94
65.3%
Decimal Number 24
 
16.7%
Uppercase Letter 9
 
6.2%
Other Punctuation 8
 
5.6%
Space Separator 5
 
3.5%
Dash Punctuation 4
 
2.8%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 18
19.1%
e 12
12.8%
t 11
11.7%
n 8
8.5%
r 7
 
7.4%
i 6
 
6.4%
s 5
 
5.3%
u 4
 
4.3%
l 4
 
4.3%
o 3
 
3.2%
Other values (8) 16
17.0%
Decimal Number
ValueCountFrequency (%)
1 7
29.2%
7 4
16.7%
6 3
12.5%
3 2
 
8.3%
9 2
 
8.3%
2 2
 
8.3%
5 1
 
4.2%
8 1
 
4.2%
0 1
 
4.2%
4 1
 
4.2%
Uppercase Letter
ValueCountFrequency (%)
P 2
22.2%
M 2
22.2%
A 2
22.2%
G 1
11.1%
C 1
11.1%
J 1
11.1%
Other Punctuation
ValueCountFrequency (%)
, 4
50.0%
. 4
50.0%
Space Separator
ValueCountFrequency (%)
5
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 4
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 103
71.5%
Common 41
 
28.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 18
17.5%
e 12
11.7%
t 11
10.7%
n 8
 
7.8%
r 7
 
6.8%
i 6
 
5.8%
s 5
 
4.9%
u 4
 
3.9%
l 4
 
3.9%
o 3
 
2.9%
Other values (14) 25
24.3%
Common
ValueCountFrequency (%)
1 7
17.1%
5
12.2%
, 4
9.8%
7 4
9.8%
. 4
9.8%
- 4
9.8%
6 3
7.3%
3 2
 
4.9%
9 2
 
4.9%
2 2
 
4.9%
Other values (4) 4
9.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 144
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 18
 
12.5%
e 12
 
8.3%
t 11
 
7.6%
n 8
 
5.6%
r 7
 
4.9%
1 7
 
4.9%
i 6
 
4.2%
s 5
 
3.5%
5
 
3.5%
u 4
 
2.8%
Other values (28) 61
42.4%
Distinct 5
Distinct (%) 83.3%
Missing 3814093
Missing (%) > 99.9%
Memory size 29.1 MiB

Length

Max length 13
Median length 10
Mean length 8.333333333
Min length 5

Characters and Unicode

Total characters 50
Distinct characters 24
Distinct categories 4
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 4
Unique (%) 66.7%

Sample

1st row Ochrophyta
2nd row WGS84
3rd row WGS84
4th row Rhodophyta
5th row United States
ValueCountFrequency (%)
wgs84 2
28.6%
ochrophyta 1
14.3%
rhodophyta 1
14.3%
united 1
14.3%
states 1
14.3%
plantae 1
14.3%

Most occurring characters

ValueCountFrequency (%)
t 6
 
12.0%
a 5
 
10.0%
h 4
 
8.0%
S 3
 
6.0%
e 3
 
6.0%
o 3
 
6.0%
y 2
 
4.0%
n 2
 
4.0%
d 2
 
4.0%
G 2
 
4.0%
Other values (14) 18
36.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 34
68.0%
Uppercase Letter 11
 
22.0%
Decimal Number 4
 
8.0%
Space Separator 1
 
2.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t 6
17.6%
a 5
14.7%
h 4
11.8%
e 3
8.8%
o 3
8.8%
y 2
 
5.9%
n 2
 
5.9%
d 2
 
5.9%
p 2
 
5.9%
r 1
 
2.9%
Other values (4) 4
11.8%
Uppercase Letter
ValueCountFrequency (%)
S 3
27.3%
G 2
18.2%
W 2
18.2%
R 1
 
9.1%
O 1
 
9.1%
U 1
 
9.1%
P 1
 
9.1%
Decimal Number
ValueCountFrequency (%)
4 2
50.0%
8 2
50.0%
Space Separator
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 45
90.0%
Common 5
 
10.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
t 6
13.3%
a 5
 
11.1%
h 4
 
8.9%
S 3
 
6.7%
e 3
 
6.7%
o 3
 
6.7%
y 2
 
4.4%
n 2
 
4.4%
d 2
 
4.4%
G 2
 
4.4%
Other values (11) 13
28.9%
Common
ValueCountFrequency (%)
4 2
40.0%
8 2
40.0%
1
20.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 50
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
t 6
 
12.0%
a 5
 
10.0%
h 4
 
8.0%
S 3
 
6.0%
e 3
 
6.0%
o 3
 
6.0%
y 2
 
4.0%
n 2
 
4.0%
d 2
 
4.0%
G 2
 
4.0%
Other values (14) 18
36.0%
Distinct 4
Distinct (%) 100.0%
Missing 3814095
Missing (%) > 99.9%
Memory size 29.1 MiB

Length

Max length 15
Median length 14
Mean length 13.75
Min length 12

Characters and Unicode

Total characters 55
Distinct characters 19
Distinct categories 2
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 4
Unique (%) 100.0%

Sample

1st row Phaeophyceae
2nd row Campanulaceae
3rd row Florideophyceae
4th row Marchantiophyta
ValueCountFrequency (%)
phaeophyceae 1
25.0%
campanulaceae 1
25.0%
florideophyceae 1
25.0%
marchantiophyta 1
25.0%

Most occurring characters

ValueCountFrequency (%)
a 10
18.2%
e 8
14.5%
h 5
9.1%
o 4
 
7.3%
p 4
 
7.3%
c 4
 
7.3%
y 3
 
5.5%
t 2
 
3.6%
r 2
 
3.6%
n 2
 
3.6%
Other values (9) 11
20.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 51
92.7%
Uppercase Letter 4
 
7.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 10
19.6%
e 8
15.7%
h 5
9.8%
o 4
 
7.8%
p 4
 
7.8%
c 4
 
7.8%
y 3
 
5.9%
t 2
 
3.9%
r 2
 
3.9%
n 2
 
3.9%
Other values (5) 7
13.7%
Uppercase Letter
ValueCountFrequency (%)
M 1
25.0%
P 1
25.0%
F 1
25.0%
C 1
25.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 55
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 10
18.2%
e 8
14.5%
h 5
9.1%
o 4
 
7.3%
p 4
 
7.3%
c 4
 
7.3%
y 3
 
5.5%
t 2
 
3.6%
r 2
 
3.6%
n 2
 
3.6%
Other values (9) 11
20.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 55
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 10
18.2%
e 8
14.5%
h 5
9.1%
o 4
 
7.3%
p 4
 
7.3%
c 4
 
7.3%
y 3
 
5.5%
t 2
 
3.6%
r 2
 
3.6%
n 2
 
3.6%
Other values (9) 11
20.0%

identificationRemarks
Text

Missing 

Distinct 4
Distinct (%) 100.0%
Missing 3814095
Missing (%) > 99.9%
Memory size 29.1 MiB

Length

Max length 17
Median length 14.5
Mean length 12.5
Min length 9

Characters and Unicode

Total characters 50
Distinct characters 19
Distinct categories 2
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 4
Unique (%) 100.0%

Sample

1st row Ectocarpales
2nd row Gigartinales
3rd row Louisiana
4th row Jungermanniopsida
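
The character-category figures reported for identificationRemarks can be reproduced approximately with Python's standard `unicodedata` module. The sketch below is an illustration only, not the profiling tool's own implementation; the helper name `category_counts` is hypothetical, and the input is simply the four sample values listed above. It counts characters by Unicode general category, where `Ll` is Lowercase Letter and `Lu` is Uppercase Letter.

```python
import unicodedata
from collections import Counter

# The four sample values listed for identificationRemarks.
values = ["Ectocarpales", "Gigartinales", "Louisiana", "Jungermanniopsida"]


def category_counts(strings):
    """Count characters by Unicode general category (e.g. 'Ll', 'Lu')."""
    counts = Counter()
    for s in strings:
        counts.update(unicodedata.category(ch) for ch in s)
    return counts


counts = category_counts(values)
total = sum(counts.values())

# Print each category with its count and its share of all characters.
for cat, n in counts.most_common():
    print(f"{cat}: {n} ({n / total:.1%})")
```

For these four values the sketch yields 46 lowercase and 4 uppercase letters out of 50 characters, which matches the 92.0% and 8.0% shares shown in the category table for this variable.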
ValueCountFrequency (%)
ectocarpales 1
25.0%
gigartinales 1
25.0%
louisiana 1
25.0%
jungermanniopsida 1
25.0%

Most occurring characters

ValueCountFrequency (%)
a 8
16.0%
i 6
12.0%
n 5
10.0%
s 4
 
8.0%
o 3
 
6.0%
r 3
 
6.0%
e 3
 
6.0%
u 2
 
4.0%
t 2
 
4.0%
p 2
 
4.0%
Other values (9) 12
24.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 46
92.0%
Uppercase Letter 4
 
8.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 8
17.4%
i 6
13.0%
n 5
10.9%
s 4
8.7%
o 3
 
6.5%
r 3
 
6.5%
e 3
 
6.5%
u 2
 
4.3%
t 2
 
4.3%
p 2
 
4.3%
Other values (5) 8
17.4%
Uppercase Letter
ValueCountFrequency (%)
J 1
25.0%
E 1
25.0%
L 1
25.0%
G 1
25.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 50
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 8
16.0%
i 6
12.0%
n 5
10.0%
s 4
 
8.0%
o 3
 
6.0%
r 3
 
6.0%
e 3
 
6.0%
u 2
 
4.0%
t 2
 
4.0%
p 2
 
4.0%
Other values (9) 12
24.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 50
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 8
16.0%
i 6
12.0%
n 5
10.0%
s 4
 
8.0%
o 3
 
6.0%
r 3
 
6.0%
e 3
 
6.0%
u 2
 
4.0%
t 2
 
4.0%
p 2
 
4.0%
Other values (9) 12
24.0%

taxonID
Text

Constant  Missing 

Distinct 1
Distinct (%) 100.0%
Missing 3814098
Missing (%) > 99.9%
Memory size 29.1 MiB

Length

Max length 12
Median length 12
Mean length 12
Min length 12

Characters and Unicode

Total characters 12
Distinct characters 10
Distinct categories 2
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 1
Unique (%) 100.0%

Sample

1st row Metzgeriales
ValueCountFrequency (%)
metzgeriales 1
100.0%

Most occurring characters

ValueCountFrequency (%)
e 3
25.0%
M 1
 
8.3%
t 1
 
8.3%
z 1
 
8.3%
g 1
 
8.3%
r 1
 
8.3%
i 1
 
8.3%
a 1
 
8.3%
l 1
 
8.3%
s 1
 
8.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 11
91.7%
Uppercase Letter 1
 
8.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 3
27.3%
t 1
 
9.1%
z 1
 
9.1%
g 1
 
9.1%
r 1
 
9.1%
i 1
 
9.1%
a 1
 
9.1%
l 1
 
9.1%
s 1
 
9.1%
Uppercase Letter
ValueCountFrequency (%)
M 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 12
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 3
25.0%
M 1
 
8.3%
t 1
 
8.3%
z 1
 
8.3%
g 1
 
8.3%
r 1
 
8.3%
i 1
 
8.3%
a 1
 
8.3%
l 1
 
8.3%
s 1
 
8.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 12
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 3
25.0%
M 1
 
8.3%
t 1
 
8.3%
z 1
 
8.3%
g 1
 
8.3%
r 1
 
8.3%
i 1
 
8.3%
a 1
 
8.3%
l 1
 
8.3%
s 1
 
8.3%

scientificNameID
Text

Missing 

Distinct 2
Distinct (%) 100.0%
Missing 3814097
Missing (%) > 99.9%
Memory size 29.1 MiB

Length

Max length 17
Median length 16.5
Mean length 16.5
Min length 16

Characters and Unicode

Total characters 33
Distinct characters 16
Distinct categories 2
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 2
Unique (%) 100.0%

Sample

1st row Scytosiphonaceae
2nd row Rhizophyllidaceae
ValueCountFrequency (%)
scytosiphonaceae 1
50.0%
rhizophyllidaceae 1
50.0%

Most occurring characters

ValueCountFrequency (%)
a 4
12.1%
e 4
12.1%
c 3
9.1%
o 3
9.1%
i 3
9.1%
h 3
9.1%
y 2
 
6.1%
p 2
 
6.1%
l 2
 
6.1%
S 1
 
3.0%
Other values (6) 6
18.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 31
93.9%
Uppercase Letter 2
 
6.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 4
12.9%
e 4
12.9%
c 3
9.7%
o 3
9.7%
i 3
9.7%
h 3
9.7%
y 2
6.5%
p 2
6.5%
l 2
6.5%
t 1
 
3.2%
Other values (4) 4
12.9%
Uppercase Letter
ValueCountFrequency (%)
S 1
50.0%
R 1
50.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 33
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 4
12.1%
e 4
12.1%
c 3
9.1%
o 3
9.1%
i 3
9.1%
h 3
9.1%
y 2
 
6.1%
p 2
 
6.1%
l 2
 
6.1%
S 1
 
3.0%
Other values (6) 6
18.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 33
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 4
12.1%
e 4
12.1%
c 3
9.1%
o 3
9.1%
i 3
9.1%
h 3
9.1%
y 2
 
6.1%
p 2
 
6.1%
l 2
 
6.1%
S 1
 
3.0%
Other values (6) 6
18.2%

acceptedNameUsageID
Text

Missing 

Distinct 3
Distinct (%) 100.0%
Missing 3814096
Missing (%) > 99.9%
Memory size 29.1 MiB

Length

Max length 10
Median length 9
Mean length 9
Min length 8

Characters and Unicode

Total characters 27
Distinct characters 13
Distinct categories 2
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 3
Unique (%) 100.0%

Sample

1st row Campanula
2nd row Raceland
3rd row Aneuraceae
ValueCountFrequency (%)
campanula 1
33.3%
raceland 1
33.3%
aneuraceae 1
33.3%

Most occurring characters

ValueCountFrequency (%)
a 7
25.9%
e 4
14.8%
n 3
11.1%
u 2
 
7.4%
l 2
 
7.4%
c 2
 
7.4%
C 1
 
3.7%
m 1
 
3.7%
p 1
 
3.7%
R 1
 
3.7%
Other values (3) 3
11.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 24
88.9%
Uppercase Letter 3
 
11.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 7
29.2%
e 4
16.7%
n 3
12.5%
u 2
 
8.3%
l 2
 
8.3%
c 2
 
8.3%
m 1
 
4.2%
p 1
 
4.2%
d 1
 
4.2%
r 1
 
4.2%
Uppercase Letter
ValueCountFrequency (%)
C 1
33.3%
R 1
33.3%
A 1
33.3%

Most occurring scripts

ValueCountFrequency (%)
Latin 27
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 7
25.9%
e 4
14.8%
n 3
11.1%
u 2
 
7.4%
l 2
 
7.4%
c 2
 
7.4%
C 1
 
3.7%
m 1
 
3.7%
p 1
 
3.7%
R 1
 
3.7%
Other values (3) 3
11.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 27
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 7
25.9%
e 4
14.8%
n 3
11.1%
u 2
 
7.4%
l 2
 
7.4%
c 2
 
7.4%
C 1
 
3.7%
m 1
 
3.7%
p 1
 
3.7%
R 1
 
3.7%
Other values (3) 3
11.1%

parentNameUsageID
Text

Constant  Missing 

Distinct 1
Distinct (%) 100.0%
Missing 3814098
Missing (%) > 99.9%
Memory size 29.1 MiB

Length

Max length 68
Median length 68
Mean length 68
Min length 68

Characters and Unicode

Total characters 68
Distinct characters 21
Distinct categories 6
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 1
Unique (%) 100.0%

Sample

1st row Plantae, Dicotyledonae (basal), Magnoliales, Annonaceae, Annonoideae
ValueCountFrequency (%)
plantae 1
16.7%
dicotyledonae 1
16.7%
basal 1
16.7%
magnoliales 1
16.7%
annonaceae 1
16.7%
annonoideae 1
16.7%

Most occurring characters

ValueCountFrequency (%)
a 10
14.7%
n 9
13.2%
e 8
11.8%
o 6
8.8%
5
 
7.4%
l 5
 
7.4%
, 4
 
5.9%
i 3
 
4.4%
c 2
 
2.9%
s 2
 
2.9%
Other values (11) 14
20.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 52
76.5%
Space Separator 5
 
7.4%
Uppercase Letter 5
 
7.4%
Other Punctuation 4
 
5.9%
Open Punctuation 1
 
1.5%
Close Punctuation 1
 
1.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 10
19.2%
n 9
17.3%
e 8
15.4%
o 6
11.5%
l 5
9.6%
i 3
 
5.8%
c 2
 
3.8%
s 2
 
3.8%
d 2
 
3.8%
t 2
 
3.8%
Other values (3) 3
 
5.8%
Uppercase Letter
ValueCountFrequency (%)
A 2
40.0%
D 1
20.0%
M 1
20.0%
P 1
20.0%
Space Separator
ValueCountFrequency (%)
5
100.0%
Other Punctuation
ValueCountFrequency (%)
, 4
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 57
83.8%
Common 11
 
16.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 10
17.5%
n 9
15.8%
e 8
14.0%
o 6
10.5%
l 5
8.8%
i 3
 
5.3%
c 2
 
3.5%
s 2
 
3.5%
d 2
 
3.5%
A 2
 
3.5%
Other values (7) 8
14.0%
Common
ValueCountFrequency (%)
5
45.5%
, 4
36.4%
( 1
 
9.1%
) 1
 
9.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 68
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 10
14.7%
n 9
13.2%
e 8
11.8%
o 6
8.8%
5
 
7.4%
l 5
 
7.4%
, 4
 
5.9%
i 3
 
4.4%
c 2
 
2.9%
s 2
 
2.9%
Other values (11) 14
20.6%

originalNameUsageID
Text

Constant  Missing 

Distinct 1
Distinct (%) 100.0%
Missing 3814098
Missing (%) > 99.9%
Memory size 29.1 MiB

Length

Max length 7
Median length 7
Mean length 7
Min length 7

Characters and Unicode

Total characters 7
Distinct characters 6
Distinct categories 2
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 1
Unique (%) 100.0%

Sample

1st row Plantae
ValueCountFrequency (%)
plantae 1
100.0%

Most occurring characters

ValueCountFrequency (%)
a 2
28.6%
P 1
14.3%
l 1
14.3%
n 1
14.3%
t 1
14.3%
e 1
14.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 6
85.7%
Uppercase Letter 1
 
14.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 2
33.3%
l 1
16.7%
n 1
16.7%
t 1
16.7%
e 1
16.7%
Uppercase Letter
ValueCountFrequency (%)
P 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 7
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 2
28.6%
P 1
14.3%
l 1
14.3%
n 1
14.3%
t 1
14.3%
e 1
14.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 7
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 2
28.6%
P 1
14.3%
l 1
14.3%
n 1
14.3%
t 1
14.3%
e 1
14.3%

nameAccordingToID
Text

Missing 

Distinct 2
Distinct (%) 100.0%
Missing 3814097
Missing (%) > 99.9%
Memory size 29.1 MiB

Length

Max length 10
Median length 9
Mean length 9
Min length 8

Characters and Unicode

Total characters 18
Distinct characters 15
Distinct categories 2
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 2
Unique (%) 100.0%

Sample

1st row Colpomenia
2nd row Ochtodes
ValueCountFrequency (%)
colpomenia 1
50.0%
ochtodes 1
50.0%

Most occurring characters

ValueCountFrequency (%)
o 3
16.7%
e 2
 
11.1%
C 1
 
5.6%
l 1
 
5.6%
p 1
 
5.6%
m 1
 
5.6%
n 1
 
5.6%
i 1
 
5.6%
a 1
 
5.6%
O 1
 
5.6%
Other values (5) 5
27.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 16
88.9%
Uppercase Letter 2
 
11.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o 3
18.8%
e 2
12.5%
l 1
 
6.2%
p 1
 
6.2%
m 1
 
6.2%
n 1
 
6.2%
i 1
 
6.2%
a 1
 
6.2%
c 1
 
6.2%
h 1
 
6.2%
Other values (3) 3
18.8%
Uppercase Letter
ValueCountFrequency (%)
C 1
50.0%
O 1
50.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 18
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
o 3
16.7%
e 2
 
11.1%
C 1
 
5.6%
l 1
 
5.6%
p 1
 
5.6%
m 1
 
5.6%
n 1
 
5.6%
i 1
 
5.6%
a 1
 
5.6%
O 1
 
5.6%
Other values (5) 5
27.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 18
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
o 3
16.7%
e 2
 
11.1%
C 1
 
5.6%
l 1
 
5.6%
p 1
 
5.6%
m 1
 
5.6%
n 1
 
5.6%
i 1
 
5.6%
a 1
 
5.6%
O 1
 
5.6%
Other values (5) 5
27.8%

namePublishedInID
Text

Missing 

Distinct 3
Distinct (%) 100.0%
Missing 3814096
Missing (%) > 99.9%
Memory size 29.1 MiB

Length

Max length 21
Median length 12
Mean length 14
Min length 9

Characters and Unicode

Total characters 42
Distinct characters 20
Distinct categories 5
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 3
Unique (%) 100.0%

Sample

1st row rotundifolia
2nd row Dicotyledonae (basal)
3rd row Riccardia
ValueCountFrequency (%)
rotundifolia 1
25.0%
dicotyledonae 1
25.0%
basal 1
25.0%
riccardia 1
25.0%

Most occurring characters

ValueCountFrequency (%)
a 6
14.3%
i 5
11.9%
o 4
 
9.5%
d 3
 
7.1%
l 3
 
7.1%
c 3
 
7.1%
r 2
 
4.8%
t 2
 
4.8%
n 2
 
4.8%
e 2
 
4.8%
Other values (10) 10
23.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 37
88.1%
Uppercase Letter 2
 
4.8%
Open Punctuation 1
 
2.4%
Close Punctuation 1
 
2.4%
Space Separator 1
 
2.4%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 6
16.2%
i 5
13.5%
o 4
10.8%
d 3
8.1%
l 3
8.1%
c 3
8.1%
r 2
 
5.4%
t 2
 
5.4%
n 2
 
5.4%
e 2
 
5.4%
Other values (5) 5
13.5%
Uppercase Letter
ValueCountFrequency (%)
D 1
50.0%
R 1
50.0%
Open Punctuation
ValueCountFrequency (%)
( 1
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1
100.0%
Space Separator
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 39
92.9%
Common 3
 
7.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 6
15.4%
i 5
12.8%
o 4
10.3%
d 3
7.7%
l 3
7.7%
c 3
7.7%
r 2
 
5.1%
t 2
 
5.1%
n 2
 
5.1%
e 2
 
5.1%
Other values (7) 7
17.9%
Common
ValueCountFrequency (%)
( 1
33.3%
) 1
33.3%
1
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 42
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 6
14.3%
i 5
11.9%
o 4
 
9.5%
d 3
 
7.1%
l 3
 
7.1%
c 3
 
7.1%
r 2
 
4.8%
t 2
 
4.8%
n 2
 
4.8%
e 2
 
4.8%
Other values (10) 10
23.8%

taxonConceptID
Text

Constant  Missing 

Distinct 1
Distinct (%) 100.0%
Missing 3814098
Missing (%) > 99.9%
Memory size 29.1 MiB

Length

Max length 11
Median length 11
Mean length 11
Min length 11

Characters and Unicode

Total characters 11
Distinct characters 9
Distinct categories 2
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 1
Unique (%) 100.0%

Sample

1st row Magnoliales
ValueCountFrequency (%)
magnoliales 1
100.0%

Most occurring characters

ValueCountFrequency (%)
a 2
18.2%
l 2
18.2%
M 1
9.1%
g 1
9.1%
n 1
9.1%
o 1
9.1%
i 1
9.1%
e 1
9.1%
s 1
9.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 10
90.9%
Uppercase Letter 1
 
9.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 2
20.0%
l 2
20.0%
g 1
10.0%
n 1
10.0%
o 1
10.0%
i 1
10.0%
e 1
10.0%
s 1
10.0%
Uppercase Letter
ValueCountFrequency (%)
M 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 11
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 2
18.2%
l 2
18.2%
M 1
9.1%
g 1
9.1%
n 1
9.1%
o 1
9.1%
i 1
9.1%
e 1
9.1%
s 1
9.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 11
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 2
18.2%
l 2
18.2%
M 1
9.1%
g 1
9.1%
n 1
9.1%
o 1
9.1%
i 1
9.1%
e 1
9.1%
s 1
9.1%

scientificName
Text

Missing 

Distinct 498139
Distinct (%) 13.6%
Missing 152724
Missing (%) 4.0%
Memory size 29.1 MiB

Length

Max length 125
Median length 97
Mean length 20.18342371
Min length 3

Characters and Unicode

Total characters 73899083
Distinct characters 98
Distinct categories 11
Distinct scripts 2
Distinct blocks 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 254723
Unique (%) 7.0%

Sample

1st row Lesquerella lescurii
2nd row Desmognathus ochrophaeus
3rd row Ninoe kinbergi
4th row Gomphus adelphus
5th row Skrjabinoclava catoptrophori
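
The percentages reported for scientificName can be checked with a few lines of arithmetic. The sketch below is only an illustration: the variable names are hypothetical, the row count is the "Number of observations" from the report overview, and the choice of denominator (all rows for the missing share, non-missing rows for the distinct share, unique share and mean length) is an inference that happens to reproduce the reported figures, not a documented description of how the profiling tool computes them.

```python
# Figures copied from the report (overview and scientificName summary).
n_rows = 3_814_099        # Number of observations
n_missing = 152_724       # Missing
n_distinct = 498_139      # Distinct
n_unique = 254_723        # Unique (values occurring exactly once)
total_chars = 73_899_083  # Total characters

# Assumption: all shares except "Missing (%)" use non-missing rows only.
n_present = n_rows - n_missing

missing_pct = n_missing / n_rows
distinct_pct = n_distinct / n_present
unique_pct = n_unique / n_present
mean_length = total_chars / n_present

print(f"Missing (%):  {missing_pct:.1%}")
print(f"Distinct (%): {distinct_pct:.1%}")
print(f"Unique (%):   {unique_pct:.1%}")
print(f"Mean length:  {mean_length:.8f}")
```

Under these assumptions the results round to 4.0%, 13.6%, 7.0% and a mean length of about 20.1834, in line with the values listed for this variable.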
ValueCountFrequency (%)
sp 224114
 
2.8%
var 87003
 
1.1%
plethodon 69434
 
0.9%
subsp 43445
 
0.5%
cinereus 35438
 
0.4%
bombus 28778
 
0.4%
carex 23618
 
0.3%
indet 17121
 
0.2%
peromyscus 16160
 
0.2%
desmognathus 14838
 
0.2%
Other values (211783) 7432205
93.0%

Most occurring characters

ValueCountFrequency (%)
a 8143176
 
11.0%
i 6697207
 
9.1%
s 5452327
 
7.4%
e 4860883
 
6.6%
o 4602351
 
6.2%
r 4549912
 
6.2%
4330779
 
5.9%
u 3993086
 
5.4%
l 3969942
 
5.4%
n 3867669
 
5.2%
Other values (88) 23431751
31.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 65218673
88.3%
Space Separator 4330779
 
5.9%
Uppercase Letter 3761747
 
5.1%
Other Punctuation 400495
 
0.5%
Open Punctuation 87330
 
0.1%
Close Punctuation 87329
 
0.1%
Dash Punctuation 9209
 
< 0.1%
Decimal Number 3384
 
< 0.1%
Connector Punctuation 120
 
< 0.1%
Math Symbol 16
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 8143176
12.5%
i 6697207
10.3%
s 5452327
 
8.4%
e 4860883
 
7.5%
o 4602351
 
7.1%
r 4549912
 
7.0%
u 3993086
 
6.1%
l 3969942
 
6.1%
n 3867669
 
5.9%
t 3451881
 
5.3%
Other values (27) 15630239
24.0%
Uppercase Letter
ValueCountFrequency (%)
P 560260
14.9%
C 497486
13.2%
A 346053
 
9.2%
S 335540
 
8.9%
M 252876
 
6.7%
L 201820
 
5.4%
E 192599
 
5.1%
T 180945
 
4.8%
D 169789
 
4.5%
B 158459
 
4.2%
Other values (18) 865920
23.0%
Other Punctuation
ValueCountFrequency (%)
. 391387
97.7%
" 3810
 
1.0%
, 2144
 
0.5%
' 1809
 
0.5%
& 937
 
0.2%
? 279
 
0.1%
/ 108
 
< 0.1%
# 18
 
< 0.1%
! 1
 
< 0.1%
1
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
2 974
28.8%
1 802
23.7%
0 734
21.7%
5 426
12.6%
9 114
 
3.4%
8 90
 
2.7%
3 84
 
2.5%
7 64
 
1.9%
4 53
 
1.6%
6 43
 
1.3%
Math Symbol
ValueCountFrequency (%)
× 8
50.0%
+ 5
31.2%
~ 2
 
12.5%
= 1
 
6.2%
Open Punctuation
ValueCountFrequency (%)
( 87296
> 99.9%
[ 34
 
< 0.1%
Close Punctuation
ValueCountFrequency (%)
) 87295
> 99.9%
] 34
 
< 0.1%
Space Separator
ValueCountFrequency (%)
4330779
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 9209
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 120
100.0%
Final Punctuation
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 68980420
93.3%
Common 4918663
 
6.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 8143176
11.8%
i 6697207
 
9.7%
s 5452327
 
7.9%
e 4860883
 
7.0%
o 4602351
 
6.7%
r 4549912
 
6.6%
u 3993086
 
5.8%
l 3969942
 
5.8%
n 3867669
 
5.6%
t 3451881
 
5.0%
Other values (55) 19391986
28.1%
Common
ValueCountFrequency (%)
4330779
88.0%
. 391387
 
8.0%
( 87296
 
1.8%
) 87295
 
1.8%
- 9209
 
0.2%
" 3810
 
0.1%
, 2144
 
< 0.1%
' 1809
 
< 0.1%
2 974
 
< 0.1%
& 937
 
< 0.1%
Other values (23) 3023
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 73898592
> 99.9%
None 489
 
< 0.1%
Punctuation 2
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 8143176
 
11.0%
i 6697207
 
9.1%
s 5452327
 
7.4%
e 4860883
 
6.6%
o 4602351
 
6.2%
r 4549912
 
6.2%
4330779
 
5.9%
u 3993086
 
5.4%
l 3969942
 
5.4%
n 3867669
 
5.2%
Other values (72) 23431260
31.7%
None
ValueCountFrequency (%)
ë 292
59.7%
ö 51
 
10.4%
á 45
 
9.2%
ü 40
 
8.2%
Á 20
 
4.1%
é 15
 
3.1%
ó 8
 
1.6%
× 8
 
1.6%
É 4
 
0.8%
ñ 2
 
0.4%
Other values (4) 4
 
0.8%
Punctuation
ValueCountFrequency (%)
1
50.0%
1
50.0%

acceptedNameUsage
Text

Missing 

Distinct 3
Distinct (%) 100.0%
Missing 3814096
Missing (%) > 99.9%
Memory size 29.1 MiB

Length

Max length 10
Median length 7
Mean length 6.666666667
Min length 3

Characters and Unicode

Total characters 20
Distinct characters 11
Distinct categories 3
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 3
Unique (%) 100.0%

Sample

1st row sinuosa
2nd row Annonaceae
3rd row sp.
ValueCountFrequency (%)
sinuosa 1
33.3%
annonaceae 1
33.3%
sp 1
33.3%

Most occurring characters

ValueCountFrequency (%)
n 4
20.0%
s 3
15.0%
a 3
15.0%
o 2
10.0%
e 2
10.0%
i 1
 
5.0%
u 1
 
5.0%
A 1
 
5.0%
c 1
 
5.0%
p 1
 
5.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 18
90.0%
Uppercase Letter 1
 
5.0%
Other Punctuation 1
 
5.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n 4
22.2%
s 3
16.7%
a 3
16.7%
o 2
11.1%
e 2
11.1%
i 1
 
5.6%
u 1
 
5.6%
c 1
 
5.6%
p 1
 
5.6%
Uppercase Letter
ValueCountFrequency (%)
A 1
100.0%
Other Punctuation
ValueCountFrequency (%)
. 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 19
95.0%
Common 1
 
5.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
n 4
21.1%
s 3
15.8%
a 3
15.8%
o 2
10.5%
e 2
10.5%
i 1
 
5.3%
u 1
 
5.3%
A 1
 
5.3%
c 1
 
5.3%
p 1
 
5.3%
Common
ValueCountFrequency (%)
. 1
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 20
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
n 4
20.0%
s 3
15.0%
a 3
15.0%
o 2
10.0%
e 2
10.0%
i 1
 
5.0%
u 1
 
5.0%
A 1
 
5.0%
c 1
 
5.0%
p 1
 
5.0%

parentNameUsage
Text

Constant  Missing 

Distinct 1
Distinct (%) 100.0%
Missing 3814098
Missing (%) > 99.9%
Memory size 29.1 MiB

Length

Max length 7
Median length 7
Mean length 7
Min length 7

Characters and Unicode

Total characters 7
Distinct characters 6
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 1
Unique (%) 100.0%

Sample

1st row pinguis
ValueCountFrequency (%)
pinguis 1
100.0%

Most occurring characters

ValueCountFrequency (%)
i 2
28.6%
p 1
14.3%
n 1
14.3%
g 1
14.3%
u 1
14.3%
s 1
14.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 7
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 2
28.6%
p 1
14.3%
n 1
14.3%
g 1
14.3%
u 1
14.3%
s 1
14.3%

Most occurring scripts

ValueCountFrequency (%)
Latin 7
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 2
28.6%
p 1
14.3%
n 1
14.3%
g 1
14.3%
u 1
14.3%
s 1
14.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 7
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 2
28.6%
p 1
14.3%
n 1
14.3%
g 1
14.3%
u 1
14.3%
s 1
14.3%

originalNameUsage
Text

Missing 

Distinct 2
Distinct (%) 100.0%
Missing 3814097
Missing (%) > 99.9%
Memory size 29.1 MiB

Length

Max length 9
Median length 5.5
Mean length 5.5
Min length 2

Characters and Unicode

Total characters 11
Distinct characters 10
Distinct categories 3
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 2
Unique (%) 100.0%

Sample

1st row GEOLocate
2nd row L.
ValueCountFrequency (%)
geolocate 1
50.0%
l 1
50.0%

Most occurring characters

ValueCountFrequency (%)
L 2
18.2%
G 1
9.1%
E 1
9.1%
O 1
9.1%
o 1
9.1%
c 1
9.1%
a 1
9.1%
t 1
9.1%
e 1
9.1%
. 1
9.1%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 5
45.5%
Lowercase Letter 5
45.5%
Other Punctuation 1
 
9.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o 1
20.0%
c 1
20.0%
a 1
20.0%
t 1
20.0%
e 1
20.0%
Uppercase Letter
ValueCountFrequency (%)
L 2
40.0%
G 1
20.0%
E 1
20.0%
O 1
20.0%
Other Punctuation
ValueCountFrequency (%)
. 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 10
90.9%
Common 1
 
9.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
L 2
20.0%
G 1
10.0%
E 1
10.0%
O 1
10.0%
o 1
10.0%
c 1
10.0%
a 1
10.0%
t 1
10.0%
e 1
10.0%
Common
ValueCountFrequency (%)
. 1
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 11
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
L 2
18.2%
G 1
9.1%
E 1
9.1%
O 1
9.1%
o 1
9.1%
c 1
9.1%
a 1
9.1%
t 1
9.1%
e 1
9.1%
. 1
9.1%

namePublishedIn
Text

Constant  Missing 

Distinct 1
Distinct (%) 100.0%
Missing 3814098
Missing (%) > 99.9%
Memory size 29.1 MiB

Length

Max length 9
Median length 9
Mean length 9
Min length 9

Characters and Unicode

Total characters 9
Distinct characters 7
Distinct categories 2
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 1
Unique (%) 100.0%

Sample

1st row Guatteria
ValueCountFrequency (%)
guatteria 1
100.0%

Most occurring characters

ValueCountFrequency (%)
a 2
22.2%
t 2
22.2%
G 1
11.1%
u 1
11.1%
e 1
11.1%
r 1
11.1%
i 1
11.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 8
88.9%
Uppercase Letter 1
 
11.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 2
25.0%
t 2
25.0%
u 1
12.5%
e 1
12.5%
r 1
12.5%
i 1
12.5%
Uppercase Letter
ValueCountFrequency (%)
G 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 9
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 2
22.2%
t 2
22.2%
G 1
11.1%
u 1
11.1%
e 1
11.1%
r 1
11.1%
i 1
11.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 9
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 2
22.2%
t 2
22.2%
G 1
11.1%
u 1
11.1%
e 1
11.1%
r 1
11.1%
i 1
11.1%

namePublishedInYear
Text

Constant  Missing 

Distinct 1
Distinct (%) 100.0%
Missing 3814098
Missing (%) > 99.9%
Memory size 29.1 MiB

Length

Max length 34
Median length 34
Mean length 34
Min length 34

Characters and Unicode

Total characters 34
Distinct characters 20
Distinct categories 6
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 1
Unique (%) 100.0%

Sample

1st row (K. Mert. ex Roth) Derbes & Solier
ValueCountFrequency (%)
k 1
14.3%
mert 1
14.3%
ex 1
14.3%
roth 1
14.3%
derbes 1
14.3%
1
14.3%
solier 1
14.3%

Most occurring characters

ValueCountFrequency (%)
6
17.6%
e 5
14.7%
r 3
 
8.8%
o 2
 
5.9%
. 2
 
5.9%
t 2
 
5.9%
D 1
 
2.9%
l 1
 
2.9%
S 1
 
2.9%
& 1
 
2.9%
Other values (10) 10
29.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 18
52.9%
Space Separator 6
 
17.6%
Uppercase Letter 5
 
14.7%
Other Punctuation 3
 
8.8%
Open Punctuation 1
 
2.9%
Close Punctuation 1
 
2.9%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 5
27.8%
r 3
16.7%
o 2
 
11.1%
t 2
 
11.1%
l 1
 
5.6%
s 1
 
5.6%
b 1
 
5.6%
h 1
 
5.6%
x 1
 
5.6%
i 1
 
5.6%
Uppercase Letter
ValueCountFrequency (%)
D 1
20.0%
S 1
20.0%
K 1
20.0%
R 1
20.0%
M 1
20.0%
Other Punctuation
ValueCountFrequency (%)
. 2
66.7%
& 1
33.3%
Space Separator
ValueCountFrequency (%)
6
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 23
67.6%
Common 11
32.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 5
21.7%
r 3
13.0%
o 2
 
8.7%
t 2
 
8.7%
D 1
 
4.3%
l 1
 
4.3%
S 1
 
4.3%
s 1
 
4.3%
b 1
 
4.3%
h 1
 
4.3%
Other values (5) 5
21.7%
Common
ValueCountFrequency (%)
6
54.5%
. 2
 
18.2%
& 1
 
9.1%
( 1
 
9.1%
) 1
 
9.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 34
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
6
17.6%
e 5
14.7%
r 3
 
8.8%
o 2
 
5.9%
. 2
 
5.9%
t 2
 
5.9%
D 1
 
2.9%
l 1
 
2.9%
S 1
 
2.9%
& 1
 
2.9%
Other values (10) 10
29.4%

higherClassification
Text

Distinct 10142
Distinct (%) 0.3%
Missing 8025
Missing (%) 0.2%
Memory size 29.1 MiB

Length

Max length 164
Median length 148
Mean length 65.02585814
Min length 6

Characters and Unicode

Total characters 247493228
Distinct characters 73
Distinct categories 10
Distinct scripts 2
Distinct blocks 2

Unique

Unique 1548
Unique (%) < 0.1%

Sample

1st row Animalia, Arthropoda, Crustacea, Malacostraca, Eumalacostraca, Eucarida, Decapoda, Pleocyemata, Hippolytidae
2nd row Plantae, Dicotyledonae, Brassicales, Brassicaceae, Brassicoideae
3rd row Animalia, Chordata, Vertebrata, Amphibia, Caudata, Plethodontidae
4th row Animalia, Cnidaria, Anthozoa, Hexacorallia, Scleractinia
5th row Animalia, Annelida, Polychaeta, Errantia, Eunicida, Lumbrineridae
ValueCountFrequency (%)
animalia 1953363
 
9.1%
plantae 1703265
 
7.9%
dicotyledonae 1061726
 
4.9%
chordata 924299
 
4.3%
vertebrata 915878
 
4.3%
arthropoda 407833
 
1.9%
monocotyledonae 373594
 
1.7%
mollusca 356697
 
1.7%
poales 288260
 
1.3%
gastropoda 251919
 
1.2%
Other values (10186) 13296608
61.7%

Most occurring characters

ValueCountFrequency (%)
a 34476529
13.9%
e 24748170
 
10.0%
i 18023886
 
7.3%
17727368
 
7.2%
, 17673090
 
7.1%
o 15704488
 
6.3%
t 13361876
 
5.4%
l 12174850
 
4.9%
r 11460198
 
4.6%
n 10609214
 
4.3%
Other values (63) 71533559
28.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 190534785
77.0%
Uppercase Letter 21479072
 
8.7%
Space Separator 17727368
 
7.2%
Other Punctuation 17681466
 
7.1%
Open Punctuation 35134
 
< 0.1%
Close Punctuation 35134
 
< 0.1%
Dash Punctuation 201
 
< 0.1%
Connector Punctuation 51
 
< 0.1%
Decimal Number 16
 
< 0.1%
Math Symbol 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 34476529
18.1%
e 24748170
13.0%
i 18023886
9.5%
o 15704488
8.2%
t 13361876
 
7.0%
l 12174850
 
6.4%
r 11460198
 
6.0%
n 10609214
 
5.6%
d 8725587
 
4.6%
c 8272856
 
4.3%
Other values (17) 32977131
17.3%
Uppercase Letter
ValueCountFrequency (%)
A 4453774
20.7%
P 3956585
18.4%
C 2595797
12.1%
M 1813334
8.4%
D 1362205
 
6.3%
V 1021212
 
4.8%
E 864737
 
4.0%
S 853251
 
4.0%
L 556792
 
2.6%
R 552324
 
2.6%
Other values (16) 3449061
16.1%
Decimal Number
ValueCountFrequency (%)
6 3
18.8%
2 2
12.5%
9 2
12.5%
7 2
12.5%
0 2
12.5%
1 2
12.5%
3 2
12.5%
4 1
 
6.2%
Other Punctuation
ValueCountFrequency (%)
, 17673090
> 99.9%
. 8341
 
< 0.1%
? 24
 
< 0.1%
/ 11
 
< 0.1%
Open Punctuation
ValueCountFrequency (%)
( 35072
99.8%
[ 62
 
0.2%
Close Punctuation
ValueCountFrequency (%)
) 35072
99.8%
] 62
 
0.2%
Space Separator
ValueCountFrequency (%)
17727368
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 201
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 51
100.0%
Math Symbol
ValueCountFrequency (%)
+ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 212013857
85.7%
Common 35479371
 
14.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 34476529
16.3%
e 24748170
11.7%
i 18023886
 
8.5%
o 15704488
 
7.4%
t 13361876
 
6.3%
l 12174850
 
5.7%
r 11460198
 
5.4%
n 10609214
 
5.0%
d 8725587
 
4.1%
c 8272856
 
3.9%
Other values (43) 54456203
25.7%
Common
ValueCountFrequency (%)
17727368
50.0%
, 17673090
49.8%
( 35072
 
0.1%
) 35072
 
0.1%
. 8341
 
< 0.1%
- 201
 
< 0.1%
[ 62
 
< 0.1%
] 62
 
< 0.1%
_ 51
 
< 0.1%
? 24
 
< 0.1%
Other values (10) 28
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 247492967
> 99.9%
None 261
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 34476529
13.9%
e 24748170
 
10.0%
i 18023886
 
7.3%
17727368
 
7.2%
, 17673090
 
7.1%
o 15704488
 
6.3%
t 13361876
 
5.4%
l 12174850
 
4.9%
r 11460198
 
4.6%
n 10609214
 
4.3%
Other values (62) 71533298
28.9%
None
ValueCountFrequency (%)
ö 261
100.0%

kingdom
Text

Distinct 16
Distinct (%) < 0.1%
Missing 10040
Missing (%) 0.3%
Memory size 29.1 MiB

Length

Max length 33
Median length 8
Mean length 7.495749146
Min length 5

Characters and Unicode

Total characters 28514272
Distinct characters 33
Distinct categories 6
Distinct scripts 2
Distinct blocks 1

Unique

Unique 5
Unique (%) < 0.1%

Sample

1st row Animalia
2nd row Plantae
3rd row Animalia
4th row Animalia
5th row Animalia
ValueCountFrequency (%)
animalia 1953363
51.3%
plantae 1703246
44.8%
fungi 91813
 
2.4%
eubacteria 21587
 
0.6%
chromista 17285
 
0.5%
protista 15845
 
0.4%
protozoa 896
 
< 0.1%
bacteria 9
 
< 0.1%
animalis 6
 
< 0.1%
animala 4
 
< 0.1%
Other values (7) 9
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
a 7390458
25.9%
i 4053285
14.2%
n 3748435
13.1%
l 3656619
12.8%
m 1970659
 
6.9%
A 1953373
 
6.9%
t 1774717
 
6.2%
e 1724848
 
6.0%
P 1719986
 
6.0%
u 113402
 
0.4%
Other values (23) 408490
 
1.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 24710205
86.7%
Uppercase Letter 3804056
 
13.3%
Decimal Number 5
 
< 0.1%
Space Separator 4
 
< 0.1%
Dash Punctuation 1
 
< 0.1%
Other Punctuation 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 7390458
29.9%
i 4053285
16.4%
n 3748435
15.2%
l 3656619
14.8%
m 1970659
 
8.0%
t 1774717
 
7.2%
e 1724848
 
7.0%
u 113402
 
0.5%
g 91814
 
0.4%
r 55628
 
0.2%
Other values (10) 130340
 
0.5%
Uppercase Letter
ValueCountFrequency (%)
A 1953373
51.3%
P 1719986
45.2%
F 91813
 
2.4%
E 21589
 
0.6%
C 17285
 
0.5%
B 9
 
< 0.1%
I 1
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
9 3
60.0%
0 1
 
20.0%
5 1
 
20.0%
Space Separator
ValueCountFrequency (%)
4
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1
100.0%
Other Punctuation
ValueCountFrequency (%)
. 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 28514261
> 99.9%
Common 11
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 7390458
25.9%
i 4053285
14.2%
n 3748435
13.1%
l 3656619
12.8%
m 1970659
 
6.9%
A 1953373
 
6.9%
t 1774717
 
6.2%
e 1724848
 
6.0%
P 1719986
 
6.0%
u 113402
 
0.4%
Other values (17) 408479
 
1.4%
Common
ValueCountFrequency (%)
4
36.4%
9 3
27.3%
- 1
 
9.1%
0 1
 
9.1%
. 1
 
9.1%
5 1
 
9.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 28514272
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 7390458
25.9%
i 4053285
14.2%
n 3748435
13.1%
l 3656619
12.8%
m 1970659
 
6.9%
A 1953373
 
6.9%
t 1774717
 
6.2%
e 1724848
 
6.0%
P 1719986
 
6.0%
u 113402
 
0.4%
Other values (23) 408490
 
1.4%

phylum
Text

Missing 

Distinct 106
Distinct (%) < 0.1%
Missing 1562087
Missing (%) 41.0%
Memory size 29.1 MiB

Length

Max length 31
Median length 8
Mean length 8.845462635
Min length 5

Characters and Unicode

Total characters 19920088
Distinct characters 52
Distinct categories 7
Distinct scripts 2
Distinct blocks 1

Unique

Unique 14
Unique (%) < 0.1%

Sample

1st row Arthropoda
2nd row Chordata
3rd row Cnidaria
4th row Annelida
5th row Arthropoda
ValueCountFrequency (%)
chordata 924299
41.0%
arthropoda 407833
18.1%
mollusca 356697
 
15.8%
annelida 99290
 
4.4%
ascomycota 90632
 
4.0%
bryophyta 61205
 
2.7%
rhodophyta 50004
 
2.2%
cnidaria 48058
 
2.1%
echinodermata 37484
 
1.7%
nematoda 28248
 
1.3%
Other values (105) 149216
 
6.6%

Most occurring characters

ValueCountFrequency (%)
a 3392588
17.0%
o 2654365
13.3%
r 2009507
10.1%
t 1753345
8.8%
h 1694517
8.5%
d 1599243
8.0%
C 1015241
 
5.1%
l 905045
 
4.5%
c 648529
 
3.3%
A 600581
 
3.0%
Other values (42) 3647127
18.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 17666309
88.7%
Uppercase Letter 2251999
 
11.3%
Space Separator 954
 
< 0.1%
Other Punctuation 660
 
< 0.1%
Dash Punctuation 117
 
< 0.1%
Connector Punctuation 47
 
< 0.1%
Decimal Number 2
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 3392588
19.2%
o 2654365
15.0%
r 2009507
11.4%
t 1753345
9.9%
h 1694517
9.6%
d 1599243
9.1%
l 905045
 
5.1%
c 648529
 
3.7%
p 597572
 
3.4%
s 469272
 
2.7%
Other values (14) 1942326
11.0%
Uppercase Letter
ValueCountFrequency (%)
C 1015241
45.1%
A 600581
26.7%
M 370202
 
16.4%
B 79873
 
3.5%
R 50413
 
2.2%
P 43753
 
1.9%
E 37618
 
1.7%
N 30920
 
1.4%
O 12211
 
0.5%
S 4597
 
0.2%
Other values (12) 6590
 
0.3%
Decimal Number
ValueCountFrequency (%)
8 1
50.0%
4 1
50.0%
Space Separator
ValueCountFrequency (%)
954
100.0%
Other Punctuation
ValueCountFrequency (%)
. 660
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 117
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 47
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 19918308
> 99.9%
Common 1780
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 3392588
17.0%
o 2654365
13.3%
r 2009507
10.1%
t 1753345
8.8%
h 1694517
8.5%
d 1599243
8.0%
C 1015241
 
5.1%
l 905045
 
4.5%
c 648529
 
3.3%
A 600581
 
3.0%
Other values (36) 3645347
18.3%
Common
ValueCountFrequency (%)
954
53.6%
. 660
37.1%
- 117
 
6.6%
_ 47
 
2.6%
8 1
 
0.1%
4 1
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 19920088
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 3392588
17.0%
o 2654365
13.3%
r 2009507
10.1%
t 1753345
8.8%
h 1694517
8.5%
d 1599243
8.0%
C 1015241
 
5.1%
l 905045
 
4.5%
c 648529
 
3.3%
A 600581
 
3.0%
Other values (42) 3647127
18.3%

class
Text

Missing 

Distinct 225
Distinct (%) < 0.1%
Missing 102065
Missing (%) 2.7%
Memory size 29.1 MiB

Length

Max length 33
Median length 20
Mean length 11.06579573
Min length 4

Characters and Unicode

Total characters 41076610
Distinct characters 51
Distinct categories 6
Distinct scripts 2
Distinct blocks 1

Unique

Unique 35
Unique (%) < 0.1%

Sample

1st row Malacostraca
2nd row Dicotyledonae
3rd row Amphibia
4th row Anthozoa
5th row Polychaeta
ValueCountFrequency (%)
dicotyledonae 1061725
28.3%
monocotyledonae 373594
 
10.0%
gastropoda 251919
 
6.7%
mammalia 247286
 
6.6%
insecta 242424
 
6.5%
aves 240577
 
6.4%
actinopterygii 183425
 
4.9%
amphibia 162685
 
4.3%
malacostraca 124107
 
3.3%
pteridophyte 113570
 
3.0%
Other values (216) 746222
19.9%

Most occurring characters

ValueCountFrequency (%)
o 5341978
13.0%
a 4658977
11.3%
e 4540797
11.1%
t 3012097
 
7.3%
i 2919628
 
7.1%
c 2508892
 
6.1%
n 2461418
 
6.0%
l 2238371
 
5.4%
d 2073116
 
5.0%
y 2053582
 
5.0%
Other values (41) 9267754
22.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 37258936
90.7%
Uppercase Letter 3712033
 
9.0%
Space Separator 35500
 
0.1%
Open Punctuation 35023
 
0.1%
Close Punctuation 35023
 
0.1%
Other Punctuation 95
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o 5341978
14.3%
a 4658977
12.5%
e 4540797
12.2%
t 3012097
8.1%
i 2919628
7.8%
c 2508892
6.7%
n 2461418
6.6%
l 2238371
6.0%
d 2073116
 
5.6%
y 2053582
 
5.5%
Other values (15) 5450080
14.6%
Uppercase Letter
ValueCountFrequency (%)
D 1074606
28.9%
M 770943
20.8%
A 656642
17.7%
G 252003
 
6.8%
I 243001
 
6.5%
P 228128
 
6.1%
B 142983
 
3.9%
L 83972
 
2.3%
R 78213
 
2.1%
C 39792
 
1.1%
Other values (12) 141750
 
3.8%
Space Separator
ValueCountFrequency (%)
35500
100.0%
Open Punctuation
ValueCountFrequency (%)
( 35023
100.0%
Close Punctuation
ValueCountFrequency (%)
) 35023
100.0%
Other Punctuation
ValueCountFrequency (%)
. 95
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 40970969
99.7%
Common 105641
 
0.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
o 5341978
13.0%
a 4658977
11.4%
e 4540797
11.1%
t 3012097
 
7.4%
i 2919628
 
7.1%
c 2508892
 
6.1%
n 2461418
 
6.0%
l 2238371
 
5.5%
d 2073116
 
5.1%
y 2053582
 
5.0%
Other values (37) 9162113
22.4%
Common
ValueCountFrequency (%)
35500
33.6%
( 35023
33.2%
) 35023
33.2%
. 95
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 41076610
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
o 5341978
13.0%
a 4658977
11.3%
e 4540797
11.1%
t 3012097
 
7.3%
i 2919628
 
7.1%
c 2508892
 
6.1%
n 2461418
 
6.0%
l 2238371
 
5.4%
d 2073116
 
5.0%
y 2053582
 
5.0%
Other values (41) 9267754
22.6%

order
Text

Missing 

Distinct 978
Distinct (%) < 0.1%
Missing 410734
Missing (%) 10.8%
Memory size 29.1 MiB

Length

Max length 32
Median length 22
Mean length 9.636402502
Min length 5

Characters and Unicode

Total characters 32796195
Distinct characters 55
Distinct categories 6
Distinct scripts 2
Distinct blocks 1

Unique

Unique 97
Unique (%) < 0.1%

Sample

1st row Decapoda
2nd row Brassicales
3rd row Caudata
4th row Scleractinia
5th row Eunicida
ValueCountFrequency (%)
poales 288260
 
8.5%
asterales 156438
 
4.6%
passeriformes 152902
 
4.5%
rodentia 122495
 
3.6%
lamiales 109533
 
3.2%
fabales 104514
 
3.1%
caudata 97702
 
2.9%
perciformes 88214
 
2.6%
malpighiales 86512
 
2.5%
decapoda 80987
 
2.4%
Other values (968) 2116574
62.2%

Most occurring characters

ValueCountFrequency (%)
a 4889746
14.9%
e 3854264
11.8%
s 3030750
 
9.2%
l 2735926
 
8.3%
o 2274535
 
6.9%
i 2201734
 
6.7%
r 2102567
 
6.4%
t 1244556
 
3.8%
n 1059210
 
3.2%
p 999396
 
3.0%
Other values (45) 8403511
25.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 29389601
89.6%
Uppercase Letter 3403303
 
10.4%
Other Punctuation 2403
 
< 0.1%
Space Separator 766
 
< 0.1%
Open Punctuation 61
 
< 0.1%
Close Punctuation 61
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 4889746
16.6%
e 3854264
13.1%
s 3030750
10.3%
l 2735926
9.3%
o 2274535
7.7%
i 2201734
7.5%
r 2102567
7.2%
t 1244556
 
4.2%
n 1059210
 
3.6%
p 999396
 
3.4%
Other values (16) 4996917
17.0%
Uppercase Letter
ValueCountFrequency (%)
P 745213
21.9%
C 463002
13.6%
A 376082
11.1%
S 270198
 
7.9%
L 240506
 
7.1%
M 214351
 
6.3%
R 205758
 
6.0%
D 147162
 
4.3%
F 130942
 
3.8%
H 121624
 
3.6%
Other values (14) 488465
14.4%
Other Punctuation
ValueCountFrequency (%)
. 2402
> 99.9%
? 1
 
< 0.1%
Space Separator
ValueCountFrequency (%)
766
100.0%
Open Punctuation
ValueCountFrequency (%)
[ 61
100.0%
Close Punctuation
ValueCountFrequency (%)
] 61
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 32792904
> 99.9%
Common 3291
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 4889746
14.9%
e 3854264
11.8%
s 3030750
 
9.2%
l 2735926
 
8.3%
o 2274535
 
6.9%
i 2201734
 
6.7%
r 2102567
 
6.4%
t 1244556
 
3.8%
n 1059210
 
3.2%
p 999396
 
3.0%
Other values (40) 8400220
25.6%
Common
ValueCountFrequency (%)
. 2402
73.0%
766
 
23.3%
[ 61
 
1.9%
] 61
 
1.9%
? 1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 32796195
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 4889746
14.9%
e 3854264
11.8%
s 3030750
 
9.2%
l 2735926
 
8.3%
o 2274535
 
6.9%
i 2201734
 
6.7%
r 2102567
 
6.4%
t 1244556
 
3.8%
n 1059210
 
3.2%
p 999396
 
3.0%
Other values (45) 8403511
25.6%

family
Text

Missing 

Distinct 6247
Distinct (%) 0.2%
Missing 101008
Missing (%) 2.6%
Memory size 29.1 MiB

Length

Max length 38
Median length 33
Mean length 10.82154545
Min length 6

Characters and Unicode

Total characters 40181383
Distinct characters 64
Distinct categories 9
Distinct scripts 2
Distinct blocks 1

Unique

Unique 679
Unique (%) < 0.1%

Sample

1st row Hippolytidae
2nd row Brassicaceae
3rd row Plethodontidae
4th row Lumbrineridae
5th row Gomphidae
ValueCountFrequency (%)
poaceae 206776
 
5.6%
asteraceae 147421
 
4.0%
fabaceae 97640
 
2.6%
plethodontidae 91218
 
2.5%
cyperaceae 57015
 
1.5%
rubiaceae 49120
 
1.3%
cricetidae 44315
 
1.2%
muridae 38592
 
1.0%
apidae 34263
 
0.9%
melastomataceae 30107
 
0.8%
Other values (6234) 2925182
78.6%

Most occurring characters

ValueCountFrequency (%)
a 7167192
17.8%
e 7127326
17.7%
i 3783943
9.4%
c 2767017
 
6.9%
d 2343643
 
5.8%
o 1987878
 
4.9%
r 1899670
 
4.7%
l 1560793
 
3.9%
n 1413847
 
3.5%
t 1394290
 
3.5%
Other values (54) 8735784
21.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 36454714
90.7%
Uppercase Letter 3713091
 
9.2%
Space Separator 8558
 
< 0.1%
Other Punctuation 4923
 
< 0.1%
Open Punctuation 41
 
< 0.1%
Close Punctuation 41
 
< 0.1%
Decimal Number 10
 
< 0.1%
Connector Punctuation 4
 
< 0.1%
Math Symbol 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 7167192
19.7%
e 7127326
19.6%
i 3783943
10.4%
c 2767017
 
7.6%
d 2343643
 
6.4%
o 1987878
 
5.5%
r 1899670
 
5.2%
l 1560793
 
4.3%
n 1413847
 
3.9%
t 1394290
 
3.8%
Other values (16) 5009115
13.7%
Uppercase Letter
ValueCountFrequency (%)
P 736705
19.8%
C 513178
13.8%
A 429262
11.6%
S 276035
 
7.4%
M 228091
 
6.1%
L 172968
 
4.7%
T 149659
 
4.0%
R 149142
 
4.0%
F 139759
 
3.8%
E 139164
 
3.7%
Other values (16) 779128
21.0%
Decimal Number
ValueCountFrequency (%)
6 3
30.0%
0 2
20.0%
1 2
20.0%
3 2
20.0%
9 1
 
10.0%
Other Punctuation
ValueCountFrequency (%)
. 4914
99.8%
? 9
 
0.2%
Space Separator
ValueCountFrequency (%)
8558
100.0%
Open Punctuation
ValueCountFrequency (%)
( 41
100.0%
Close Punctuation
ValueCountFrequency (%)
) 41
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 4
100.0%
Math Symbol
ValueCountFrequency (%)
+ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 40167805
> 99.9%
Common 13578
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 7167192
17.8%
e 7127326
17.7%
i 3783943
9.4%
c 2767017
 
6.9%
d 2343643
 
5.8%
o 1987878
 
4.9%
r 1899670
 
4.7%
l 1560793
 
3.9%
n 1413847
 
3.5%
t 1394290
 
3.5%
Other values (42) 8722206
21.7%
Common
ValueCountFrequency (%)
8558
63.0%
. 4914
36.2%
( 41
 
0.3%
) 41
 
0.3%
? 9
 
0.1%
_ 4
 
< 0.1%
6 3
 
< 0.1%
0 2
 
< 0.1%
1 2
 
< 0.1%
3 2
 
< 0.1%
Other values (2) 2
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 40181383
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 7167192
17.8%
e 7127326
17.7%
i 3783943
9.4%
c 2767017
 
6.9%
d 2343643
 
5.8%
o 1987878
 
4.9%
r 1899670
 
4.7%
l 1560793
 
3.9%
n 1413847
 
3.5%
t 1394290
 
3.5%
Other values (54) 8735784
21.7%

subfamily
Text

Constant  Missing 

Distinct 1
Distinct (%) 100.0%
Missing 3814098
Missing (%) > 99.9%
Memory size 29.1 MiB

Length

Max length 19
Median length 19
Mean length 19
Min length 19

Characters and Unicode

Total characters 19
Distinct characters 15
Distinct categories 6
Distinct scripts 2
Distinct blocks 1

Unique

Unique 1
Unique (%) 100.0%

Sample

1st row (Aubl.) R.A. Howard
ValueCountFrequency (%)
aubl 1
33.3%
r.a 1
33.3%
howard 1
33.3%

Most occurring characters

ValueCountFrequency (%)
. 3
15.8%
A 2
 
10.5%
2
 
10.5%
( 1
 
5.3%
u 1
 
5.3%
b 1
 
5.3%
l 1
 
5.3%
) 1
 
5.3%
R 1
 
5.3%
H 1
 
5.3%
Other values (5) 5
26.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 8
42.1%
Uppercase Letter 4
21.1%
Other Punctuation 3
 
15.8%
Space Separator 2
 
10.5%
Open Punctuation 1
 
5.3%
Close Punctuation 1
 
5.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
u 1
12.5%
b 1
12.5%
l 1
12.5%
o 1
12.5%
w 1
12.5%
a 1
12.5%
r 1
12.5%
d 1
12.5%
Uppercase Letter
ValueCountFrequency (%)
A 2
50.0%
R 1
25.0%
H 1
25.0%
Other Punctuation
ValueCountFrequency (%)
. 3
100.0%
Space Separator
ValueCountFrequency (%)
2
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 12
63.2%
Common 7
36.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 2
16.7%
u 1
8.3%
b 1
8.3%
l 1
8.3%
R 1
8.3%
H 1
8.3%
o 1
8.3%
w 1
8.3%
a 1
8.3%
r 1
8.3%
Common
ValueCountFrequency (%)
. 3
42.9%
2
28.6%
( 1
 
14.3%
) 1
 
14.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 19
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 3
15.8%
A 2
 
10.5%
2
 
10.5%
( 1
 
5.3%
u 1
 
5.3%
b 1
 
5.3%
l 1
 
5.3%
) 1
 
5.3%
R 1
 
5.3%
H 1
 
5.3%
Other values (5) 5
26.3%

genus
Text

Missing 

Distinct 70442
Distinct (%) 1.9%
Missing 162837
Missing (%) 4.3%
Memory size 29.1 MiB

Length

Max length 35
Median length 25
Mean length 8.949369834
Min length 1

Characters and Unicode

Total characters 32676494
Distinct characters 72
Distinct categories 10
Distinct scripts 2
Distinct blocks 3

Unique

Unique 20728
Unique (%) 0.6%

Sample

1st row Lesquerella
2nd row Desmognathus
3rd row Ninoe
4th row Gomphus
5th row Skrjabinoclava
ValueCountFrequency (%)
plethodon 69419
 
1.9%
bombus 25851
 
0.7%
carex 23618
 
0.6%
peromyscus 16159
 
0.4%
desmognathus 14837
 
0.4%
indet 14188
 
0.4%
poa 12303
 
0.3%
cyperus 11356
 
0.3%
cladonia 11033
 
0.3%
paspalum 10616
 
0.3%
Other values (70416) 3448082
94.3%

Most occurring characters

ValueCountFrequency (%)
a 3570749
 
10.9%
i 2703577
 
8.3%
o 2622759
 
8.0%
e 2270868
 
6.9%
s 2132346
 
6.5%
r 2080887
 
6.4%
l 1789654
 
5.5%
u 1660949
 
5.1%
n 1617510
 
5.0%
t 1546059
 
4.7%
Other values (62) 10681136
32.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 29005224
88.8%
Uppercase Letter 3650636
 
11.2%
Other Punctuation 14221
 
< 0.1%
Space Separator 6200
 
< 0.1%
Open Punctuation 70
 
< 0.1%
Close Punctuation 70
 
< 0.1%
Dash Punctuation 49
 
< 0.1%
Decimal Number 15
 
< 0.1%
Connector Punctuation 8
 
< 0.1%
Final Punctuation 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 3570749
12.3%
i 2703577
 
9.3%
o 2622759
 
9.0%
e 2270868
 
7.8%
s 2132346
 
7.4%
r 2080887
 
7.2%
l 1789654
 
6.2%
u 1660949
 
5.7%
n 1617510
 
5.6%
t 1546059
 
5.3%
Other values (18) 7009866
24.2%
Uppercase Letter
ValueCountFrequency (%)
P 539981
14.8%
C 485059
13.3%
A 336052
 
9.2%
S 325425
 
8.9%
M 245659
 
6.7%
L 195244
 
5.3%
E 189255
 
5.2%
T 174807
 
4.8%
D 165471
 
4.5%
B 152665
 
4.2%
Other values (16) 841018
23.0%
Decimal Number
ValueCountFrequency (%)
0 4
26.7%
3 4
26.7%
6 3
20.0%
1 2
13.3%
4 1
 
6.7%
9 1
 
6.7%
Other Punctuation
ValueCountFrequency (%)
. 14196
99.8%
? 20
 
0.1%
/ 4
 
< 0.1%
! 1
 
< 0.1%
Open Punctuation
ValueCountFrequency (%)
( 46
65.7%
[ 24
34.3%
Close Punctuation
ValueCountFrequency (%)
) 46
65.7%
] 24
34.3%
Space Separator
ValueCountFrequency (%)
6200
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 49
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 8
100.0%
Final Punctuation
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 32655860
99.9%
Common 20634
 
0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 3570749
 
10.9%
i 2703577
 
8.3%
o 2622759
 
8.0%
e 2270868
 
7.0%
s 2132346
 
6.5%
r 2080887
 
6.4%
l 1789654
 
5.5%
u 1660949
 
5.1%
n 1617510
 
5.0%
t 1546059
 
4.7%
Other values (44) 10660502
32.6%
Common
ValueCountFrequency (%)
. 14196
68.8%
6200
30.0%
- 49
 
0.2%
( 46
 
0.2%
) 46
 
0.2%
[ 24
 
0.1%
] 24
 
0.1%
? 20
 
0.1%
_ 8
 
< 0.1%
0 4
 
< 0.1%
Other values (8) 17
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 32676225
> 99.9%
None 268
 
< 0.1%
Punctuation 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 3570749
 
10.9%
i 2703577
 
8.3%
o 2622759
 
8.0%
e 2270868
 
6.9%
s 2132346
 
6.5%
r 2080887
 
6.4%
l 1789654
 
5.5%
u 1660949
 
5.1%
n 1617510
 
5.0%
t 1546059
 
4.7%
Other values (59) 10680867
32.7%
None
ValueCountFrequency (%)
ë 264
98.5%
ö 4
 
1.5%
Punctuation
ValueCountFrequency (%)
1
100.0%

subgenus
Text

Missing 

Distinct 4536
Distinct (%) 5.4%
Missing 3729484
Missing (%) 97.8%
Memory size 29.1 MiB

Length

Max length 21
Median length 17
Mean length 10.12022691
Min length 1

Characters and Unicode

Total characters 856323
Distinct characters 55
Distinct categories 4
Distinct scripts 2
Distinct blocks 1

Unique

Unique 1590
Unique (%) 1.9%

Sample

1st row Colobostylus
2nd row Tricyphona
3rd row Angulus
4th row Costellaria
5th row Agathistoma
ValueCountFrequency (%)
pyrobombus 8813
 
10.4%
bombus 2923
 
3.5%
apis 1481
 
1.8%
thericium 1417
 
1.7%
fervidobombus 1384
 
1.6%
depressicambarus 1232
 
1.5%
ortmannicus 1037
 
1.2%
stephanoconus 1008
 
1.2%
neoxylocopa 981
 
1.2%
alpinobombus 647
 
0.8%
Other values (4526) 63703
75.3%

Most occurring characters

ValueCountFrequency (%)
o 89127
 
10.4%
a 81161
 
9.5%
i 64482
 
7.5%
s 63807
 
7.5%
r 60943
 
7.1%
u 51272
 
6.0%
e 44787
 
5.2%
m 41692
 
4.9%
l 41013
 
4.8%
b 38453
 
4.5%
Other values (45) 279586
32.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 771663
90.1%
Uppercase Letter 84615
 
9.9%
Other Punctuation 34
 
< 0.1%
Space Separator 11
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o 89127
11.5%
a 81161
10.5%
i 64482
 
8.4%
s 63807
 
8.3%
r 60943
 
7.9%
u 51272
 
6.6%
e 44787
 
5.8%
m 41692
 
5.4%
l 41013
 
5.3%
b 38453
 
5.0%
Other values (16) 194926
25.3%
Uppercase Letter
ValueCountFrequency (%)
P 18761
22.2%
C 9181
10.9%
A 8176
9.7%
S 6424
 
7.6%
T 5387
 
6.4%
M 5100
 
6.0%
B 4333
 
5.1%
L 3486
 
4.1%
D 3473
 
4.1%
N 3241
 
3.8%
Other values (16) 17053
20.2%
Other Punctuation
ValueCountFrequency (%)
. 32
94.1%
? 2
 
5.9%
Space Separator
ValueCountFrequency (%)
11
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 856278
> 99.9%
Common 45
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
o 89127
 
10.4%
a 81161
 
9.5%
i 64482
 
7.5%
s 63807
 
7.5%
r 60943
 
7.1%
u 51272
 
6.0%
e 44787
 
5.2%
m 41692
 
4.9%
l 41013
 
4.8%
b 38453
 
4.5%
Other values (42) 279541
32.6%
Common
ValueCountFrequency (%)
. 32
71.1%
11
 
24.4%
? 2
 
4.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 856323
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
o 89127
 
10.4%
a 81161
 
9.5%
i 64482
 
7.5%
s 63807
 
7.5%
r 60943
 
7.1%
u 51272
 
6.0%
e 44787
 
5.2%
m 41692
 
4.9%
l 41013
 
4.8%
b 38453
 
4.5%
Other values (45) 279586
32.6%

infragenericEpithet
Text

Missing 

Distinct 2
Distinct (%) 100.0%
Missing 3814097
Missing (%) > 99.9%
Memory size 29.1 MiB

Length

Max length 18
Median length 16
Mean length 16
Min length 14

Characters and Unicode

Total characters 32
Distinct characters 17
Distinct categories 3
Distinct scripts 2
Distinct blocks 1

Unique

Unique 2
Unique (%) 100.0%

Sample

1st row Carex maculata
2nd row Tursiops truncatus
ValueCountFrequency (%)
carex 1
25.0%
maculata 1
25.0%
tursiops 1
25.0%
truncatus 1
25.0%

Most occurring characters

ValueCountFrequency (%)
a 5
15.6%
u 4
12.5%
r 3
9.4%
s 3
9.4%
t 3
9.4%
2
 
6.2%
c 2
 
6.2%
T 1
 
3.1%
p 1
 
3.1%
o 1
 
3.1%
Other values (7) 7
21.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 28
87.5%
Space Separator 2
 
6.2%
Uppercase Letter 2
 
6.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 5
17.9%
u 4
14.3%
r 3
10.7%
s 3
10.7%
t 3
10.7%
c 2
 
7.1%
p 1
 
3.6%
o 1
 
3.6%
i 1
 
3.6%
l 1
 
3.6%
Other values (4) 4
14.3%
Uppercase Letter
ValueCountFrequency (%)
T 1
50.0%
C 1
50.0%
Space Separator
ValueCountFrequency (%)
2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 30
93.8%
Common 2
 
6.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 5
16.7%
u 4
13.3%
r 3
10.0%
s 3
10.0%
t 3
10.0%
c 2
 
6.7%
T 1
 
3.3%
p 1
 
3.3%
o 1
 
3.3%
i 1
 
3.3%
Other values (6) 6
20.0%
Common
ValueCountFrequency (%)
2
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 32
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 5
15.6%
u 4
12.5%
r 3
9.4%
s 3
9.4%
t 3
9.4%
2
 
6.2%
c 2
 
6.2%
T 1
 
3.1%
p 1
 
3.1%
o 1
 
3.1%
Other values (7) 7
21.9%

specificEpithet
Text

Missing 

Distinct 136016
Distinct (%) 3.8%
Missing 190700
Missing (%) 5.0%
Memory size 29.1 MiB

Length

Max length 32
Median length 28
Mean length 8.563882697
Min length 1

Characters and Unicode

Total characters 31030364
Distinct characters 58
Distinct categories 9
Distinct scripts 2
Distinct blocks 3

Unique

Unique 56290
Unique (%) 1.6%

Sample

1st row lescurii
2nd row ochrophaeus
3rd row kinbergi
4th row adelphus
5th row catoptrophori
ValueCountFrequency (%)
sp 223489
 
6.2%
cinereus 33845
 
0.9%
americana 8943
 
0.2%
gracilis 8698
 
0.2%
canadensis 7759
 
0.2%
maniculatus 6546
 
0.2%
occidentalis 6486
 
0.2%
fuscus 6485
 
0.2%
elegans 6236
 
0.2%
montanus 6177
 
0.2%
Other values (135825) 3311792
91.3%

Most occurring characters

ValueCountFrequency (%)
a 3916798
12.6%
i 3443476
11.1%
s 2774878
 
8.9%
e 2213681
 
7.1%
r 2045039
 
6.6%
u 1975526
 
6.4%
n 1922945
 
6.2%
l 1895407
 
6.1%
t 1674637
 
5.4%
o 1656555
 
5.3%
Other values (48) 7511422
24.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 30788068
99.2%
Other Punctuation 229903
 
0.7%
Dash Punctuation 8742
 
< 0.1%
Space Separator 3057
 
< 0.1%
Decimal Number 443
 
< 0.1%
Connector Punctuation 112
 
< 0.1%
Open Punctuation 18
 
< 0.1%
Close Punctuation 18
 
< 0.1%
Math Symbol 3
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 3916798
12.7%
i 3443476
11.2%
s 2774878
 
9.0%
e 2213681
 
7.2%
r 2045039
 
6.6%
u 1975526
 
6.4%
n 1922945
 
6.2%
l 1895407
 
6.2%
t 1674637
 
5.4%
o 1656555
 
5.4%
Other values (20) 7269126
23.6%
Decimal Number
ValueCountFrequency (%)
1 156
35.2%
2 61
 
13.8%
0 48
 
10.8%
3 42
 
9.5%
9 37
 
8.4%
4 25
 
5.6%
6 23
 
5.2%
7 20
 
4.5%
8 16
 
3.6%
5 15
 
3.4%
Other Punctuation
ValueCountFrequency (%)
. 226830
98.7%
" 2914
 
1.3%
' 58
 
< 0.1%
/ 51
 
< 0.1%
? 27
 
< 0.1%
# 18
 
< 0.1%
, 3
 
< 0.1%
; 1
 
< 0.1%
1
 
< 0.1%
Open Punctuation
ValueCountFrequency (%)
( 16
88.9%
[ 2
 
11.1%
Close Punctuation
ValueCountFrequency (%)
) 16
88.9%
] 2
 
11.1%
Math Symbol
ValueCountFrequency (%)
~ 2
66.7%
= 1
33.3%
Dash Punctuation
ValueCountFrequency (%)
- 8742
100.0%
Space Separator
ValueCountFrequency (%)
3057
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 112
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 30788068
99.2%
Common 242296
 
0.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 3916798
12.7%
i 3443476
11.2%
s 2774878
 
9.0%
e 2213681
 
7.2%
r 2045039
 
6.6%
u 1975526
 
6.4%
n 1922945
 
6.2%
l 1895407
 
6.2%
t 1674637
 
5.4%
o 1656555
 
5.4%
Other values (20) 7269126
23.6%
Common
ValueCountFrequency (%)
. 226830
93.6%
- 8742
 
3.6%
3057
 
1.3%
" 2914
 
1.2%
1 156
 
0.1%
_ 112
 
< 0.1%
2 61
 
< 0.1%
' 58
 
< 0.1%
/ 51
 
< 0.1%
0 48
 
< 0.1%
Other values (18) 267
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 31030323
> 99.9%
None 40
 
< 0.1%
Punctuation 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 3916798
12.6%
i 3443476
11.1%
s 2774878
 
8.9%
e 2213681
 
7.1%
r 2045039
 
6.6%
u 1975526
 
6.4%
n 1922945
 
6.2%
l 1895407
 
6.1%
t 1674637
 
5.4%
o 1656555
 
5.3%
Other values (43) 7511381
24.2%
None
ValueCountFrequency (%)
ë 27
67.5%
ü 10
 
25.0%
ñ 2
 
5.0%
æ 1
 
2.5%
Punctuation
ValueCountFrequency (%)
1
100.0%

infraspecificEpithet
Text

Missing 

Distinct: 24030
Distinct (%): 5.6%
Missing: 3381784
Missing (%): 88.7%
Memory size: 29.1 MiB

Length

Max length: 33
Median length: 29
Mean length: 8.964754866
Min length: 1

Characters and Unicode

Total characters: 3875598
Distinct characters: 47
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
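
The length and character totals for infraspecificEpithet are tied together by simple arithmetic. The sketch below assumes the mean length is the total character count divided by the number of non-missing values. That is an inference from the printed numbers rather than a documented formula, and the variable names are illustrative.

```python
# Counts copied from this report: total observations (Overview) and the
# infraspecificEpithet summary.
n_rows = 3_814_099
n_missing = 3_381_784
total_characters = 3_875_598

n_present = n_rows - n_missing            # values that actually hold text
mean_length = total_characters / n_present

print(n_present)                          # 432315
print(f"{mean_length:.6f}")               # agrees with the reported 8.964754866
```

The same relation holds for specificEpithet: 31030364 characters over 3623399 non-missing values gives the reported mean of about 8.5639.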

Unique

Unique: 8599
Unique (%): 2.0%

Sample

1st row: cinnamomina
2nd row: berlandieri
3rd row: mellodora
4th row: rubiginosa
5th row: spergulariiforme

Value Count Frequency (%)
noveboracensis 2209 0.5%
domesticus 2097 0.5%
acuminatum 1842 0.4%
pennsylvanicus 1771 0.4%
cinereus 1593 0.4%
carolinensis 1550 0.4%
talpoides 1538 0.4%
minor 1414 0.3%
occidentalis 1410 0.3%
gambelii 1301 0.3%
Other values (23958) 416193 96.1%

Most occurring characters

ValueCountFrequency (%)
i 453190
11.7%
a 451397
11.6%
s 381313
9.8%
e 296985
 
7.7%
n 268978
 
6.9%
r 258031
 
6.7%
u 250988
 
6.5%
l 227869
 
5.9%
o 216334
 
5.6%
c 196282
 
5.1%
Other values (37) 874231
22.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 3874278
> 99.9%
Space Separator 603
 
< 0.1%
Dash Punctuation 341
 
< 0.1%
Other Punctuation 275
 
< 0.1%
Uppercase Letter 38
 
< 0.1%
Open Punctuation 30
 
< 0.1%
Close Punctuation 30
 
< 0.1%
Math Symbol 3
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 453190
11.7%
a 451397
11.7%
s 381313
9.8%
e 296985
 
7.7%
n 268978
 
6.9%
r 258031
 
6.7%
u 250988
 
6.5%
l 227869
 
5.9%
o 216334
 
5.6%
c 196282
 
5.1%
Other values (18) 872911
22.5%
Uppercase Letter
ValueCountFrequency (%)
I 11
28.9%
F 11
28.9%
C 6
15.8%
B 2
 
5.3%
A 2
 
5.3%
O 2
 
5.3%
H 2
 
5.3%
V 1
 
2.6%
D 1
 
2.6%
Other Punctuation
ValueCountFrequency (%)
. 232
84.4%
? 17
 
6.2%
' 15
 
5.5%
" 6
 
2.2%
/ 5
 
1.8%
Space Separator
ValueCountFrequency (%)
603
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 341
100.0%
Open Punctuation
ValueCountFrequency (%)
( 30
100.0%
Close Punctuation
ValueCountFrequency (%)
) 30
100.0%
Math Symbol
ValueCountFrequency (%)
× 3
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 3874316
> 99.9%
Common 1282
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 453190
11.7%
a 451397
11.7%
s 381313
9.8%
e 296985
 
7.7%
n 268978
 
6.9%
r 258031
 
6.7%
u 250988
 
6.5%
l 227869
 
5.9%
o 216334
 
5.6%
c 196282
 
5.1%
Other values (27) 872949
22.5%
Common
ValueCountFrequency (%)
603
47.0%
- 341
26.6%
. 232
 
18.1%
( 30
 
2.3%
) 30
 
2.3%
? 17
 
1.3%
' 15
 
1.2%
" 6
 
0.5%
/ 5
 
0.4%
× 3
 
0.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3875593
> 99.9%
None 5
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 453190
11.7%
a 451397
11.6%
s 381313
9.8%
e 296985
 
7.7%
n 268978
 
6.9%
r 258031
 
6.7%
u 250988
 
6.5%
l 227869
 
5.9%
o 216334
 
5.6%
c 196282
 
5.1%
Other values (34) 874226
22.6%
None
ValueCountFrequency (%)
× 3
60.0%
ß 1
 
20.0%
ë 1
 
20.0%

taxonRank
Text

Missing 

Distinct: 34
Distinct (%): < 0.1%
Missing: 3381907
Missing (%): 88.7%
Memory size: 29.1 MiB

Length

Max length: 17
Median length: 10
Mean length: 9.351105527
Min length: 2

Characters and Unicode

Total characters: 4041473
Distinct characters: 32
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 5
Unique (%): < 0.1%

Sample

1st row: subspecies
2nd row: subspecies
3rd row: variety
4th row: subspecies
5th row: subspecies

Value Count Frequency (%)
subspecies 341855 79.1%
variety 85773 19.8%
forma 3328 0.8%
var 898 0.2%
form 78 < 0.1%
aberration 71 < 0.1%
race 33 < 0.1%
subvariety 32 < 0.1%
aff 31 < 0.1%
nothosubsp 26 < 0.1%
Other values (19) 72 < 0.1%

Most occurring characters

ValueCountFrequency (%)
s 1025667
25.4%
e 769648
19.0%
i 427740
10.6%
b 341987
 
8.5%
u 341923
 
8.5%
p 341908
 
8.5%
c 341902
 
8.5%
r 90283
 
2.2%
a 90204
 
2.2%
t 85928
 
2.1%
Other values (22) 184283
 
4.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 4028226
99.7%
Uppercase Letter 12269
 
0.3%
Other Punctuation 965
 
< 0.1%
Space Separator 5
 
< 0.1%
Open Punctuation 4
 
< 0.1%
Close Punctuation 4
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
s 1025667
25.5%
e 769648
19.1%
i 427740
10.6%
b 341987
 
8.5%
u 341923
 
8.5%
p 341908
 
8.5%
c 341902
 
8.5%
r 90283
 
2.2%
a 90204
 
2.2%
t 85928
 
2.1%
Other values (11) 171036
 
4.2%
Uppercase Letter
ValueCountFrequency (%)
V 12023
98.0%
F 132
 
1.1%
A 68
 
0.6%
R 29
 
0.2%
M 9
 
0.1%
U 4
 
< 0.1%
C 4
 
< 0.1%
Other Punctuation
ValueCountFrequency (%)
. 965
100.0%
Space Separator
ValueCountFrequency (%)
5
100.0%
Open Punctuation
ValueCountFrequency (%)
[ 4
100.0%
Close Punctuation
ValueCountFrequency (%)
] 4
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 4040495
> 99.9%
Common 978
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
s 1025667
25.4%
e 769648
19.0%
i 427740
10.6%
b 341987
 
8.5%
u 341923
 
8.5%
p 341908
 
8.5%
c 341902
 
8.5%
r 90283
 
2.2%
a 90204
 
2.2%
t 85928
 
2.1%
Other values (18) 183305
 
4.5%
Common
ValueCountFrequency (%)
. 965
98.7%
5
 
0.5%
[ 4
 
0.4%
] 4
 
0.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4041473
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
s 1025667
25.4%
e 769648
19.0%
i 427740
10.6%
b 341987
 
8.5%
u 341923
 
8.5%
p 341908
 
8.5%
c 341902
 
8.5%
r 90283
 
2.2%
a 90204
 
2.2%
t 85928
 
2.1%
Other values (22) 184283
 
4.6%

scientificNameAuthorship
Text

Missing 

Distinct: 66565
Distinct (%): 2.8%
Missing: 1431500
Missing (%): 37.5%
Memory size: 29.1 MiB

Length

Max length: 255
Median length: 65
Mean length: 10.72469308
Min length: 2

Characters and Unicode

Total characters: 25552643
Distinct characters: 112
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 18307
Unique (%): 0.8%

Sample

1st row: (A. Gray) S. Watson
2nd row: Ehlers
3rd row: Selys
4th row: Badley
5th row: Paulson

Value Count Frequency (%)
(empty) 275794 6.1%
l 269565 6.0%
ex 120679 2.7%
a 76166 1.7%
dc 56611 1.3%
gray 48180 1.1%
kunth 44183 1.0%
linnaeus 41860 0.9%
benth 41199 0.9%
sw 36382 0.8%
Other values (17753) 3505839 77.6%
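
The value table above lists lowercase fragments such as "l", "a" and "gray" rather than whole author strings, which suggests that it counts word tokens. The sketch below shows one way such counts could arise, assuming values are lowercased, split on whitespace, and stripped of surrounding punctuation. This is a guess at the behaviour, not the profiler's documented implementation. The helper name `word_counts` and the last input string are hypothetical.

```python
import string
from collections import Counter


def word_counts(values):
    """Count lowercased, whitespace-split tokens with edge punctuation removed."""
    strip_chars = string.punctuation + string.whitespace
    counts = Counter()
    for value in values:
        for token in value.lower().split():
            counts[token.strip(strip_chars)] += 1
    return counts


# The five sample rows shown for this variable, plus one hypothetical
# author string containing "&" that is not taken from the report.
sample = [
    "(A. Gray) S. Watson",
    "Ehlers",
    "Selys",
    "Badley",
    "Paulson",
    "Torr. & A. Gray",
]
counts = word_counts(sample)
print(counts["a"])     # 2
print(counts["gray"])  # 2
print(counts[""])      # 1  (the "&" token is empty after stripping)
```

Under these assumptions a standalone "&" becomes an empty token, which would be one possible explanation for the value-less first row of the table.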

Most occurring characters

ValueCountFrequency (%)
. 2374881
 
9.3%
2133859
 
8.4%
e 1748727
 
6.8%
r 1301606
 
5.1%
a 1260238
 
4.9%
n 1150287
 
4.5%
l 1072305
 
4.2%
( 1016637
 
4.0%
) 1016637
 
4.0%
i 905553
 
3.5%
Other values (102) 11571913
45.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 14358899
56.2%
Uppercase Letter 4346097
 
17.0%
Other Punctuation 2659953
 
10.4%
Space Separator 2133859
 
8.4%
Open Punctuation 1016637
 
4.0%
Close Punctuation 1016637
 
4.0%
Dash Punctuation 20561
 
0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 1748727
12.2%
r 1301606
 
9.1%
a 1260238
 
8.8%
n 1150287
 
8.0%
l 1072305
 
7.5%
i 905553
 
6.3%
o 904808
 
6.3%
t 796053
 
5.5%
s 739601
 
5.2%
u 621804
 
4.3%
Other values (54) 3857917
26.9%
Uppercase Letter
ValueCountFrequency (%)
L 513452
11.8%
S 434070
 
10.0%
B 332546
 
7.7%
H 313612
 
7.2%
M 311417
 
7.2%
C 300680
 
6.9%
R 234805
 
5.4%
A 223211
 
5.1%
G 216969
 
5.0%
D 209797
 
4.8%
Other values (27) 1255538
28.9%
Other Punctuation
ValueCountFrequency (%)
. 2374881
89.3%
& 276039
 
10.4%
' 6061
 
0.2%
, 1799
 
0.1%
\ 1165
 
< 0.1%
? 5
 
< 0.1%
; 3
 
< 0.1%
Space Separator
ValueCountFrequency (%)
2133859
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1016637
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1016637
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 20561
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 18704996
73.2%
Common 6847647
 
26.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 1748727
 
9.3%
r 1301606
 
7.0%
a 1260238
 
6.7%
n 1150287
 
6.1%
l 1072305
 
5.7%
i 905553
 
4.8%
o 904808
 
4.8%
t 796053
 
4.3%
s 739601
 
4.0%
u 621804
 
3.3%
Other values (91) 8204014
43.9%
Common
ValueCountFrequency (%)
. 2374881
34.7%
2133859
31.2%
( 1016637
14.8%
) 1016637
14.8%
& 276039
 
4.0%
- 20561
 
0.3%
' 6061
 
0.1%
, 1799
 
< 0.1%
\ 1165
 
< 0.1%
? 5
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 25457851
99.6%
None 94792
 
0.4%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 2374881
 
9.3%
2133859
 
8.4%
e 1748727
 
6.9%
r 1301606
 
5.1%
a 1260238
 
5.0%
n 1150287
 
4.5%
l 1072305
 
4.2%
( 1016637
 
4.0%
) 1016637
 
4.0%
i 905553
 
3.6%
Other values (53) 11477121
45.1%
None
ValueCountFrequency (%)
ü 33352
35.2%
é 18990
20.0%
ö 11902
 
12.6%
è 8126
 
8.6%
ä 4210
 
4.4%
á 3601
 
3.8%
Á 3155
 
3.3%
ø 2575
 
2.7%
ó 1563
 
1.6%
Ø 1268
 
1.3%
Other values (39) 6050
 
6.4%

vernacularName
Text

Missing 

Distinct: 3
Distinct (%): 100.0%
Missing: 3814096
Missing (%): > 99.9%
Memory size: 29.1 MiB

Length

Max length: 84
Median length: 57
Mean length: 49.66666667
Min length: 8

Characters and Unicode

Total characters: 149
Distinct characters: 28
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3
Unique (%): 100.0%

Sample

1st row: Plantae, Monocotyledonae, Poales, Cyperaceae, Cyperoideae
2nd row: Holotype
3rd row: Animalia, Chordata, Vertebrata, Mammalia, Eutheria, Cetacea, Odontoceti, Delphinidae

Value Count Frequency (%)
plantae 1 7.1%
monocotyledonae 1 7.1%
poales 1 7.1%
cyperaceae 1 7.1%
cyperoideae 1 7.1%
holotype 1 7.1%
animalia 1 7.1%
chordata 1 7.1%
vertebrata 1 7.1%
mammalia 1 7.1%
Other values (4) 4 28.6%

Most occurring characters

ValueCountFrequency (%)
a 20
13.4%
e 19
12.8%
, 11
 
7.4%
11
 
7.4%
o 11
 
7.4%
t 10
 
6.7%
i 8
 
5.4%
l 7
 
4.7%
r 6
 
4.0%
n 6
 
4.0%
Other values (18) 40
26.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 113
75.8%
Uppercase Letter 14
 
9.4%
Other Punctuation 11
 
7.4%
Space Separator 11
 
7.4%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 20
17.7%
e 19
16.8%
o 11
9.7%
t 10
8.8%
i 8
 
7.1%
l 7
 
6.2%
r 6
 
5.3%
n 6
 
5.3%
d 5
 
4.4%
p 4
 
3.5%
Other values (7) 17
15.0%
Uppercase Letter
ValueCountFrequency (%)
C 4
28.6%
P 2
14.3%
M 2
14.3%
H 1
 
7.1%
A 1
 
7.1%
V 1
 
7.1%
E 1
 
7.1%
O 1
 
7.1%
D 1
 
7.1%
Other Punctuation
ValueCountFrequency (%)
, 11
100.0%
Space Separator
ValueCountFrequency (%)
11
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 127
85.2%
Common 22
 
14.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 20
15.7%
e 19
15.0%
o 11
 
8.7%
t 10
 
7.9%
i 8
 
6.3%
l 7
 
5.5%
r 6
 
4.7%
n 6
 
4.7%
d 5
 
3.9%
p 4
 
3.1%
Other values (16) 31
24.4%
Common
ValueCountFrequency (%)
, 11
50.0%
11
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 149
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 20
13.4%
e 19
12.8%
, 11
 
7.4%
11
 
7.4%
o 11
 
7.4%
t 10
 
6.7%
i 8
 
5.4%
l 7
 
4.7%
r 6
 
4.0%
n 6
 
4.0%
Other values (18) 40
26.8%

nomenclaturalCode
Text

Missing 

Distinct: 5
Distinct (%): 100.0%
Missing: 3814094
Missing (%): > 99.9%
Memory size: 29.1 MiB

Length

Max length: 16
Median length: 11
Mean length: 11.4
Min length: 7

Characters and Unicode

Total characters: 57
Distinct characters: 26
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 5
Unique (%): 100.0%

Sample

1st row: Cram, E. B.
2nd row: Howell, Tiffany
3rd row: Plantae
4th row: Maccallum, G. A.
5th row: Animalia

Value Count Frequency (%)
cram 1 10.0%
e 1 10.0%
b 1 10.0%
howell 1 10.0%
tiffany 1 10.0%
plantae 1 10.0%
maccallum 1 10.0%
g 1 10.0%
a 1 10.0%
animalia 1 10.0%

Most occurring characters

ValueCountFrequency (%)
a 8
14.0%
l 6
 
10.5%
5
 
8.8%
. 4
 
7.0%
m 3
 
5.3%
, 3
 
5.3%
n 3
 
5.3%
i 3
 
5.3%
e 2
 
3.5%
c 2
 
3.5%
Other values (16) 18
31.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 35
61.4%
Uppercase Letter 10
 
17.5%
Other Punctuation 7
 
12.3%
Space Separator 5
 
8.8%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 8
22.9%
l 6
17.1%
m 3
 
8.6%
n 3
 
8.6%
i 3
 
8.6%
e 2
 
5.7%
c 2
 
5.7%
f 2
 
5.7%
w 1
 
2.9%
r 1
 
2.9%
Other values (4) 4
11.4%
Uppercase Letter
ValueCountFrequency (%)
A 2
20.0%
T 1
10.0%
H 1
10.0%
B 1
10.0%
P 1
10.0%
M 1
10.0%
E 1
10.0%
G 1
10.0%
C 1
10.0%
Other Punctuation
ValueCountFrequency (%)
. 4
57.1%
, 3
42.9%
Space Separator
ValueCountFrequency (%)
5
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 45
78.9%
Common 12
 
21.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 8
17.8%
l 6
13.3%
m 3
 
6.7%
n 3
 
6.7%
i 3
 
6.7%
e 2
 
4.4%
c 2
 
4.4%
f 2
 
4.4%
A 2
 
4.4%
w 1
 
2.2%
Other values (13) 13
28.9%
Common
ValueCountFrequency (%)
5
41.7%
. 4
33.3%
, 3
25.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 57
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 8
14.0%
l 6
 
10.5%
5
 
8.8%
. 4
 
7.0%
m 3
 
5.3%
, 3
 
5.3%
n 3
 
5.3%
i 3
 
5.3%
e 2
 
3.5%
c 2
 
3.5%
Other values (16) 18
31.6%

taxonomicStatus
Text

Constant  Missing 

Distinct: 1
Distinct (%): 100.0%
Missing: 3814098
Missing (%): > 99.9%
Memory size: 29.1 MiB

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 8
Distinct characters: 7
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st row: Chordata

Value Count Frequency (%)
chordata 1 100.0%

Most occurring characters

ValueCountFrequency (%)
a 2
25.0%
C 1
12.5%
h 1
12.5%
o 1
12.5%
r 1
12.5%
d 1
12.5%
t 1
12.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 7
87.5%
Uppercase Letter 1
 
12.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 2
28.6%
h 1
14.3%
o 1
14.3%
r 1
14.3%
d 1
14.3%
t 1
14.3%
Uppercase Letter
ValueCountFrequency (%)
C 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 8
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 2
25.0%
C 1
12.5%
h 1
12.5%
o 1
12.5%
r 1
12.5%
d 1
12.5%
t 1
12.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 8
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 2
25.0%
C 1
12.5%
h 1
12.5%
o 1
12.5%
r 1
12.5%
d 1
12.5%
t 1
12.5%

nomenclaturalStatus
Text

Missing 

Distinct: 2
Distinct (%): 100.0%
Missing: 3814097
Missing (%): > 99.9%
Memory size: 29.1 MiB

Length

Max length: 15
Median length: 11.5
Mean length: 11.5
Min length: 8

Characters and Unicode

Total characters: 23
Distinct characters: 12
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2
Unique (%): 100.0%

Sample

1st row: Monocotyledonae
2nd row: Mammalia

Value Count Frequency (%)
monocotyledonae 1 50.0%
mammalia 1 50.0%

Most occurring characters

ValueCountFrequency (%)
o 4
17.4%
a 4
17.4%
M 2
8.7%
n 2
8.7%
l 2
8.7%
e 2
8.7%
m 2
8.7%
c 1
 
4.3%
t 1
 
4.3%
y 1
 
4.3%
Other values (2) 2
8.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 21
91.3%
Uppercase Letter 2
 
8.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o 4
19.0%
a 4
19.0%
n 2
9.5%
l 2
9.5%
e 2
9.5%
m 2
9.5%
c 1
 
4.8%
t 1
 
4.8%
y 1
 
4.8%
d 1
 
4.8%
Uppercase Letter
ValueCountFrequency (%)
M 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 23
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
o 4
17.4%
a 4
17.4%
M 2
8.7%
n 2
8.7%
l 2
8.7%
e 2
8.7%
m 2
8.7%
c 1
 
4.3%
t 1
 
4.3%
y 1
 
4.3%
Other values (2) 2
8.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 23
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
o 4
17.4%
a 4
17.4%
M 2
8.7%
n 2
8.7%
l 2
8.7%
e 2
8.7%
m 2
8.7%
c 1
 
4.3%
t 1
 
4.3%
y 1
 
4.3%
Other values (2) 2
8.7%

taxonRemarks
Text

Missing 

Distinct: 2
Distinct (%): 100.0%
Missing: 3814097
Missing (%): > 99.9%
Memory size: 29.1 MiB

Length

Max length: 7
Median length: 6.5
Mean length: 6.5
Min length: 6

Characters and Unicode

Total characters: 13
Distinct characters: 9
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2
Unique (%): 100.0%

Sample

1st row: Poales
2nd row: Cetacea

Value Count Frequency (%)
poales 1 50.0%
cetacea 1 50.0%

Most occurring characters

ValueCountFrequency (%)
a 3
23.1%
e 3
23.1%
P 1
 
7.7%
o 1
 
7.7%
l 1
 
7.7%
s 1
 
7.7%
C 1
 
7.7%
t 1
 
7.7%
c 1
 
7.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 11
84.6%
Uppercase Letter 2
 
15.4%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 3
27.3%
e 3
27.3%
o 1
 
9.1%
l 1
 
9.1%
s 1
 
9.1%
t 1
 
9.1%
c 1
 
9.1%
Uppercase Letter
ValueCountFrequency (%)
P 1
50.0%
C 1
50.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 13
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 3
23.1%
e 3
23.1%
P 1
 
7.7%
o 1
 
7.7%
l 1
 
7.7%
s 1
 
7.7%
C 1
 
7.7%
t 1
 
7.7%
c 1
 
7.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 13
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 3
23.1%
e 3
23.1%
P 1
 
7.7%
o 1
 
7.7%
l 1
 
7.7%
s 1
 
7.7%
C 1
 
7.7%
t 1
 
7.7%
c 1
 
7.7%